Abstract 

We study a hypothesis testing problem in which data is compressed distributively and sent to a detector that seeks to decide 
between two possible distributions for the data. The aim is to characterize all achievable encoding rates and exponents of the 
type 2 error probability when the type 1 error probability is at most a fixed value. For related problems in distributed source 
coding, schemes based on random binning perform well and are often optimal. For distributed hypothesis testing, however, the 
use of binning is hindered by the fact that the overall error probability may be dominated by errors in the binning process. We 
show that despite this complication, binning is optimal for a class of problems in which the goal is to "test against conditional 
independence." We then use this optimality result to give an outer bound for a more general class of instances of the problem. 

Keywords: distributed hypothesis testing, binning, test against conditional independence, Quantize-Bin-Test scheme, 
Gaussian many-help-one hypothesis testing against independence, Gel'fand and Pinsker hypothesis testing against independence, 
rate-exponent region, outer bound. 

1 Introduction 

Consider the problem of measuring the traffic on two links in a communication network and inferring whether the two links 
are carrying any common traffic [1], [2]. Evidently, this inference cannot be made by inspecting the measurements from one of 
the links alone, except in the extreme situation in which that link carries no traffic at all. Thus it is necessary to transport the 
measurements from one of the links to the other, or to transport both measurements to a third location. The measured data is 
potentially high-rate, however, so this transportation may require that the data be compressed. This raises the question of how 
to compress data when the goal is not to reproduce it per se, but rather to perform inference. A similar problem arises when 
inferring the speed of a moving vehicle from the times that it passes certain waypoints. 

These problems can be modeled mathematically by the setup depicted in Fig. 1, which we call the L-encoder general 
hypothesis testing problem. A vector source $(X_1, \dots, X_L, Y)$ has different joint distributions $P_{X_1 \dots X_L Y}$ and $Q_{X_1 \dots X_L Y}$ 
under two hypotheses $H_0$ and $H_1$, respectively. Encoder $l$ observes an i.i.d. string distributed according to $X_l$ and sends a 
message to the detector at a finite rate of $R_l$ bits per observation using a noiseless channel. The detector, which has access 
to an i.i.d. string distributed according to $Y$, makes a decision between the hypotheses. The detector may make two types 
of error: the type 1 error ($H_0$ is true but the detector decides otherwise) and the type 2 error ($H_1$ is true but the detector 
decides otherwise). The type 1 error probability is upper bounded by a fixed value. The type 2 error probability decreases 
exponentially fast, say with an exponent $E$, as the length of the i.i.d. strings increases. The goal is to characterize the rate-exponent 
region of the problem, which is the set of all achievable rate-exponent vectors $(R_1, \dots, R_L, E)$, in the regime in 
which the type 1 error probability is small. This problem was first introduced by Berger [3] (see also [4]) and arises naturally 
in many applications. Yet despite these applications, the theoretical understanding of this problem is far from complete, 
especially when compared with its sibling, distributed source coding, where random binning has been shown to be a key 
ingredient in many optimal schemes. 




Figure 1: L-encoder general hypothesis testing 

Note that if one of the variables in the set $\{X_1, \dots, X_L, Y\}$ has a different marginal distribution under $P_{X_1 \dots X_L Y}$ and 
$Q_{X_1 \dots X_L Y}$, then one of the terminals can detect the underlying hypothesis with an exponentially-decaying type 2 error 
probability, even without receiving any information from the other terminals, and could communicate this decision to other 
terminals by broadcasting a single bit. Motivated by the applications mentioned above, we shall focus our attention on the 
case in which the variables $X_1, \dots, X_L, Y$ have the same marginal distributions under both hypotheses. 

Ahlswede and Csiszár [5] studied a special case of this problem in which $L = 1$. They presented a scheme in which the 
encoder sends a quantized value of $X_1$ to the detector, which uses it to perform the test with the help of $Y$. They showed that 
their scheme is optimal for a "test against independence." Their scheme was later improved by Han [6] and Shimokawa-Han-Amari 
[7]. In the latter improvement, the encoder first quantizes $X_1$, then bins the quantized value using a Slepian and Wolf 
encoder [8]. The detector first decodes the quantized value with the help of $Y$ and then performs a likelihood ratio test. In this 
scheme, type 2 errors can occur in two different ways: the binning can fail so that the receiver decodes the wrong codeword 
and therefore makes an incorrect decision, or the true codeword can be decoded correctly yet be atypically distributed with 
$Y$, again resulting in an incorrect decision. Moreover, there is a tension between these two forms of error. If the codeword is 
a high fidelity representation of $X_1$, then binning errors are likely, yet the detector is relatively unlikely to make an incorrect 
decision if it decodes the codeword correctly. If the codeword is a low fidelity representation, then binning errors are unlikely, 
but the detector is more likely to make an incorrect decision when it decodes correctly. 

Fig. 2 illustrates this tradeoff for a fixed test channel $P_{U_1|X_1}$ used for quantization. All mutual information quantities 
are computed with respect to $P$. $\rho_2^*(U_1)$ and $\rho_1^*(U_1)$ are the exponents associated with type 2 errors due to binning errors 
and assuming correct decoding of the codeword, respectively. Formulas for each are available in [4]. For low rates, binning 
errors are common and $\rho_2^*(U_1)$ dominates the overall exponent. For high rates, binning errors are uncommon and $\rho_1^*(U_1)$ 
dominates the overall exponent. To achieve the best overall performance, the test channel should be chosen so that these two 
exponents are equal; if they are not, then making the test channel slightly more or less noisy will yield better performance. A 
similar tradeoff arises in the analysis of error exponents of binning-based schemes for the Wyner-Ziv problem [9], [10], [11], [12] 
and in the design of short block-length codes for Wyner-Ziv or joint source-channel coding. Evidently the benefit accrued 
from binning is reduced when one considers error exponents, as opposed to when the design criterion is vanishing error 
probability or average distortion, because the error exponent associated with the binning process itself may dominate the 
overall performance. 

Figure 2: Shimokawa-Han-Amari achievable region for a fixed channel $P_{U_1|X_1}$ 

The Shimokawa-Han-Amari scheme uses random, unstructured binning. It is known from the lossless source coding 
literature that structured binning schemes can strictly improve upon unstructured binning schemes in terms of the error exponents 
[13], [14], [15]. Thus, two questions naturally arise: 

1. Is the tradeoff depicted in Fig. 2 fundamental to the problem or an artifact of a suboptimal scheme? 

2. Can the scheme be improved by using structured binning? 

We conclusively answer both questions and show that unstructured binning is optimal in several important cases. We begin 
by considering a special case of the problem that we call L-encoder hypothesis testing against conditional independence. Here 
$Y$ is replaced by a three-source $(X_{L+1}, Y, Z)$ such that $Z$ induces conditional independence between $(X_1, \dots, X_L, X_{L+1})$ 
and $Y$ under $H_1$. In addition, $(X_1, \dots, X_L, X_{L+1}, Z)$ and $(Y, Z)$ have the same distributions under both hypotheses. This 
problem is a generalization of the single-encoder test against independence studied by Ahlswede and Csiszár [5]. 

For this problem we provide an achievable region, based on a scheme we call Quantize-Bin-Test, that reduces to the 
Shimokawa-Han-Amari region for $L = 1$ yet is significantly simpler. We also introduce an outer bound similar to the outer 
bound for the distributed rate-distortion problem given by Wagner and Anantharam [16]. The idea is to introduce an auxiliary 
random variable that induces conditional independence between the sources. This technique of obtaining an outer bound has 
been used to prove results in many distributed source coding problems [16], [17], [18], [19], [20], [21]. 

The inner (achievable) and outer bounds are shown to match in three examples. The first is the case in which there is only 
one encoder (L = 1). Although this problem is simply the conditional version of the test against independence studied by 
Ahlswede and Csiszár [5], the conditional version is much more complicated due to the necessary introduction of binning. 
It follows that the Shimokawa-Han-Amari scheme is optimal for $L = 1$, providing what appears to be the first nontrivial 
optimality result for this scheme. This problem arises in detecting network flows in the presence of common cross-traffic that 
is known to the detector. Here $X_1$ represents the network traffic measured at a remote location, $Y$ is the traffic measured at the 
detector, and $Z$ represents the cross-traffic. The goal is to detect the presence of common traffic beyond $Z$, i.e., to determine 
whether $Z$ captures all of the dependence between $X_1$ and $Y$. 

The second is a problem inspired by a result of Gel'fand and Pinsker [22]. We refer to this as the Gel'fand and Pinsker 
hypothesis testing against independence problem, the setup of which is shown in Fig. 3. Here $X_{L+1}$ and $Z$ are deterministic 
and there is a source $X$ which under $H_0$ is the minimal sufficient statistic for $Y$ given $(X_1, \dots, X_L)$ such that $X_1, \dots, X_L, Y$ 
are conditionally independent given $X$. We characterize the set of rate vectors $(R_1, \dots, R_L)$ that achieve the centralized 
exponent $I(X; Y)$. We show that the Quantize-Bin-Test scheme is optimal for this problem. 

Figure 3: Gel'fand and Pinsker hypothesis testing against independence 



The third is the Gaussian many-help-one hypothesis testing against independence problem, the setup of which is shown 
in Fig. 4. Here the sources are jointly Gaussian and there is another scalar Gaussian source $X$ observed by the main encoder, 
which sends a message to the detector at a rate $R$. The encoder observing $X_l$ is now referred to as helper $l$. We characterize 
the rate-exponent region of this problem in a special case when $X_1, \dots, X_L, Y$ are conditionally independent given $X$. We 
use results on related source coding problems by Oohama [23] and Prabhakaran et al. [24] to obtain an outer bound, which we 
show is achieved by the Quantize-Bin-Test scheme. 




Figure 4: Gaussian many-help-one hypothesis testing against independence 

For all three examples, we obtain the solution by observing that the relevant error exponent takes the form of a mutual 
information, and thereby relate the problem to a source-coding problem. This correspondence was first observed by Ahlswede 
and Csiszár [5]. Tian and Chen later applied it in the context of successive refinement [25]. These three conclusive results 
enable us to answer both of the above questions. Because the Shimokawa-Han-Amari scheme is optimal for $L = 1$, the 
tradeoff that it entails, depicted in Fig. 2, must be fundamental to the problem. Moreover, as both the Shimokawa-Han-Amari 
and Quantize-Bin-Test schemes do not use structured binning, we conclude that it is not necessary for this problem, at least 
in the special case considered here. 

As a byproduct of our results, we obtain an outer bound for a more general class of instances of the distributed hypothesis 
testing problem. This is the first nontrivial outer bound for the problem, and numerical experiments show that it is quite close 
to the existing achievable regions in many cases. 

The rest of the paper is organized as follows. In Section 2, we introduce the notation used in the paper. We give the 
mathematical formulation of the L-encoder general hypothesis testing problem in Section 3. Section 4 is devoted to the L-encoder 
hypothesis testing against conditional independence problem. Section 5 is on the special case in which there is only 
one encoder. The Gel'fand and Pinsker hypothesis testing against independence problem is studied in Section 6. The Gaussian 
many-help-one hypothesis testing against independence problem is studied in Section 7. Finally, we present the outer bound 
for a class of the general problem in Section 8. 



2 Notation 

We use upper case to denote random variables and vectors. Boldface is used to distinguish vectors from scalars. Arbitrary 
realizations of random variables and vectors are denoted in lower case. For a random variable $X$, $X^n$ denotes an i.i.d. vector 
of length $n$, $X^n(i)$ denotes its $i$th component, $X^n(i:j)$ denotes the $i$th through $j$th components, and $X^n(i^c)$ denotes all but 
the $i$th component. For random variables $X$ and $Y$, we use $\sigma_X^2$ and $\sigma_{X|Y}^2$ to denote the variance of $X$ and the conditional 
variance of $X$ given $Y$, respectively. The closure of a set $A$ is denoted by $\overline{A}$. $|f|$ denotes the cardinality of the range of a 
function $f$. $1_A$ denotes the indicator function of an event $A$. The determinant of a matrix $K$ is denoted by $\det(K)$. The 
notation $x^+$ denotes $\max(x, 0)$. All logarithms are to the base 2. $\mathbb{R}_+^L$ is used to denote the positive orthant in L-dimensional 
Euclidean space. The notation $X \leftrightarrow Y \leftrightarrow Z$ means that $X$, $Y$, and $Z$ form a Markov chain in this order. For $0 \le p \le 1$, 
$H_b(p)$ denotes the binary entropy function defined as 

\[ H_b(p) = -p \log p - (1 - p) \log(1 - p). \]
All entropy and mutual information quantities are under the null hypothesis, $H_0$, unless otherwise stated. 
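
As a small illustration of the binary entropy function defined above, the following sketch evaluates it in bits (base-2 logarithms, as in this section). The function name `binary_entropy` is chosen here for illustration only, and the convention $0 \log 0 = 0$ is assumed at the endpoints.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits, using the convention 0 * log(0) = 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        # Endpoints: the limit of -p log p as p -> 0 is 0.
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


if __name__ == "__main__":
    for p in (0.0, 0.11, 0.5, 1.0):
        print(p, binary_entropy(p))
```

The function is symmetric about $p = 1/2$, where it attains its maximum of one bit.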



3 L-Encoder General Hypothesis Testing 
3.1 Problem Formulation 

Let $(X_1, \dots, X_L, Y)$ be a generic source taking values in $\prod_{l=1}^{L} \mathcal{X}_l \times \mathcal{Y}$, where $\mathcal{X}_1, \dots, \mathcal{X}_L$, and $\mathcal{Y}$ are the alphabet sets of 
$X_1, \dots, X_L$, and $Y$, respectively. The distribution of the source is $P_{X_1 \dots X_L Y}$ under the null hypothesis $H_0$ and is $Q_{X_1 \dots X_L Y}$ 
under the alternate hypothesis $H_1$, i.e., 

\[ H_0 : P_{X_1 \dots X_L Y} \]
\[ H_1 : Q_{X_1 \dots X_L Y}. \]

Let $\{(X_1^n(i), \dots, X_L^n(i), Y^n(i))\}_{i=1}^{n}$ be an i.i.d. sequence of random vectors with the distribution at a single stage the same as 
that of $(X_1, \dots, X_L, Y)$. We use $\mathcal{L}$ to denote the set $\{1, \dots, L\}$. For $S \subseteq \mathcal{L}$, $S^c$ denotes the complement set $\mathcal{L} \setminus S$ and 
$\mathbf{X}_S^n(i)$ denotes $(X_l^n(i))_{l \in S}$. When $S = \mathcal{L}$, we simply write $\mathbf{X}_{\mathcal{L}}^n(i)$ as $\mathbf{X}^n(i)$. Likewise when $S = \{l\}$, we write $\mathbf{X}_{\{l\}}^n(i)$ and 
$\mathbf{X}_{\{l\}^c}^n(i)$ as $X_l^n(i)$ and $\mathbf{X}_{l^c}^n(i)$, respectively. Similar notation will be used for other collections of random variables. 

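As depicted in Fig. 1, encoder $l$ observes $X_l^n$, then sends a message to the detector using an encoding function 

\[ f_l^{(n)} : \mathcal{X}_l^n \mapsto \{1, \dots, M_l^{(n)}\}. \]

$Y^n$ is available at the detector, which uses it and the messages from the encoders to make a decision between the hypotheses 
based on a decision rule 

\[ g^{(n)}(m_1, \dots, m_L, y^n) = \begin{cases} H_0 & \text{if } (m_1, \dots, m_L, y^n) \text{ is in } A \\ H_1 & \text{otherwise,} \end{cases} \]

where 

\[ A \subseteq \prod_{l=1}^{L} \{1, \dots, M_l^{(n)}\} \times \mathcal{Y}^n \]

is the acceptance region for $H_0$. The encoders $f_l^{(n)}$ and the detector $g^{(n)}$ are such that the type 1 error probability does not 
exceed a fixed $\epsilon$ in $(0, 1)$, i.e., 

\[ P_{f_1^{(n)}(X_1^n) \dots f_L^{(n)}(X_L^n) Y^n}(A^c) \le \epsilon, \]

and the type 2 error probability does not exceed $\eta$, i.e., 

\[ Q_{f_1^{(n)}(X_1^n) \dots f_L^{(n)}(X_L^n) Y^n}(A) \le \eta. \]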
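Definition 1. A rate-exponent vector 

\[ (\mathbf{R}, E) = (R_1, \dots, R_L, E) \]

is achievable for a fixed $\epsilon$ if for any positive $\delta$ and sufficiently large $n$, there exist encoders $f_l^{(n)}$ and a detector $g^{(n)}$ such that 

\[ \frac{1}{n} \log M_l^{(n)} \le R_l + \delta \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \]
\[ -\frac{1}{n} \log \eta \ge E - \delta. \]

Let $\mathcal{R}_\epsilon$ be the set of all achievable rate-exponent vectors for a fixed $\epsilon$. The rate-exponent region $\mathcal{R}$ is defined as 

\[ \mathcal{R} = \bigcap_{\epsilon > 0} \mathcal{R}_\epsilon. \]
Our goal is to characterize the region $\mathcal{R}$. 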

3.2 Entropy Characterization of the Rate-Exponent Region 

We start with the entropy characterization of the rate-exponent region. We shall use it later in the paper to obtain inner and 
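outer bounds. Define the set 

\[ \mathcal{R}_* = \overline{\bigcup_{n} \bigcup_{(f_l^{(n)})_{l \in \mathcal{L}}} \mathcal{R}_*\big(n, (f_l^{(n)})_{l \in \mathcal{L}}\big)}, \]

where 

\[ \mathcal{R}_*\big(n, (f_l^{(n)})_{l \in \mathcal{L}}\big) = \Big\{ (\mathbf{R}, E) : R_l \ge \frac{1}{n} \log \big| f_l^{(n)} \big| \text{ for all } l \text{ in } \mathcal{L}, \text{ and } E \le \frac{1}{n} D\Big( P_{(f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} Y^n} \,\Big\|\, Q_{(f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} Y^n} \Big) \Big\}. \tag{1} \]

We have the following proposition. 
Proposition 1. $\mathcal{R} = \mathcal{R}_*$. 

The proof of Proposition 1 is a straightforward generalization of that of Theorem 1 in [5] and is hence omitted. Ahlswede 
and Csiszár [5] showed that for $L = 1$, the strong converse holds, i.e., $\mathcal{R}_\epsilon$ is independent of $\epsilon$. Thus, $\mathcal{R}_*$ is essentially 
a characterization for both $\mathcal{R}$ and $\mathcal{R}_\epsilon$. While we expect this to hold for the problem under investigation too, we shall not 
investigate it here. We next study a class of instances of the problem before returning to the general problem in Section 8. 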
4 L-Encoder Hypothesis Testing against Conditional Independence 

We consider a class of instances of the general problem, referred to as the L-encoder hypothesis testing against conditional 
independence problem, and obtain inner and outer bounds to the rate-exponent region. These bounds coincide and characterize 
the region completely in some cases. Moreover, the outer bound for this problem can be used to give an outer bound for a 
more general class of problems, as we shall see later. 

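Let $X_{L+1}$ and $Z$ be two generic sources taking values in alphabet sets $\mathcal{X}_{L+1}$ and $\mathcal{Z}$, respectively, such that $(\mathbf{X}, X_{L+1})$ 
and $Y$ are conditionally independent given $Z$ under $H_1$, and the distributions of $(\mathbf{X}, X_{L+1}, Z)$ and $(Y, Z)$ are the same under 
both hypotheses, i.e., 

\[ H_0 : P_{\mathbf{X} X_{L+1} Y | Z} P_Z \]
\[ H_1 : P_{\mathbf{X} X_{L+1} | Z} P_{Y|Z} P_Z. \]

The problem formulation is the same as before with $Y$ replaced by $(X_{L+1}, Z, Y)$ in it. The reason for focusing on this 
special case is that the relative entropy in (1) becomes a mutual information, which simplifies the analysis. Let $\mathcal{R}^{CI}$ be the 
rate-exponent region of this problem. Here "CI" stands for conditional independence. Let 

\[ \mathcal{R}_*^{CI} = \overline{\bigcup_{n} \bigcup_{(f_l^{(n)})_{l \in \mathcal{L}}} \mathcal{R}_*^{CI}\big(n, (f_l^{(n)})_{l \in \mathcal{L}}\big)}, \]

where 

\[ \mathcal{R}_*^{CI}\big(n, (f_l^{(n)})_{l \in \mathcal{L}}\big) = \Big\{ (\mathbf{R}, E) : R_l \ge \frac{1}{n} \log \big| f_l^{(n)} \big| \text{ for all } l \text{ in } \mathcal{L}, \text{ and } E \le \frac{1}{n} I\big( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n ; Y^n \,\big|\, Z^n \big) \Big\}. \]

We have the following corollary as a consequence of Proposition 1. 
Corollary 1. $\mathcal{R}^{CI} = \mathcal{R}_*^{CI}$. 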

With mutual information replacing relative entropy, the problem can be analyzed using techniques from distributed rate- 
distortion. In particular, both inner and outer bounds for that problem can be applied here. 
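
The reduction from relative entropy to mutual information can be checked numerically on a toy example. The sketch below is illustrative only: it lumps the encoder outputs and $X_{L+1}$ into a single variable $V$, uses a hypothetical joint pmf over binary $(V, Y, Z)$, and verifies that $D(P_{VYZ} \| P_{V|Z} P_{Y|Z} P_Z)$ equals $I(V; Y | Z)$. All function names are assumptions made for this example.

```python
import math


def marginal(pmf, keep):
    """Marginalize a joint pmf {outcome tuple: prob} onto the coordinates in `keep`."""
    out = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as a dict."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0.0)


def kl_to_conditional_independence(pmf):
    """D(P_{VYZ} || P_{V|Z} P_{Y|Z} P_Z) in bits, for a pmf over (v, y, z)."""
    p_vz = marginal(pmf, (0, 2))
    p_yz = marginal(pmf, (1, 2))
    p_z = marginal(pmf, (2,))
    total = 0.0
    for (v, y, z), prob in pmf.items():
        if prob > 0.0:
            # Alternative hypothesis: V and Y conditionally independent given Z.
            q = p_vz[(v, z)] * p_yz[(y, z)] / p_z[(z,)]
            total += prob * math.log2(prob / q)
    return total


def conditional_mutual_information(pmf):
    """I(V; Y | Z) = H(V,Z) + H(Y,Z) - H(Z) - H(V,Y,Z) in bits."""
    return (entropy(marginal(pmf, (0, 2))) + entropy(marginal(pmf, (1, 2)))
            - entropy(marginal(pmf, (2,))) - entropy(pmf))


if __name__ == "__main__":
    # Hypothetical joint pmf over binary (v, y, z); the weights 1..8 sum to 36.
    pmf = {(v, y, z): (1 + 4 * v + 2 * y + z) / 36
           for v in (0, 1) for y in (0, 1) for z in (0, 1)}
    print(kl_to_conditional_independence(pmf), conditional_mutual_information(pmf))
```

The two printed values should agree up to floating-point error, which is the single-letter analogue of the simplification used in this section.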

4.1 Quantize-Bin-Test Inner Bound 

Our inner bound is based on a simple scheme which we call the Quantize-Bin-Test scheme. In this scheme, encoders, as in 
the Shimokawa-Han-Amari scheme, quantize and then bin their observations, but the detector now performs the test directly 
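using the bins. The inner bound obtained is similar to the generalized Berger-Tung inner bound for distributed source coding 
[26], [27], [28]. Let $\Lambda_i$ be the set of finite-alphabet random variables $\lambda_i = (U_1, \dots, U_L, T)$ satisfying 

(C1) $T$ is independent of $(\mathbf{X}, X_{L+1}, Y, Z)$, and 

(C2) $U_l \leftrightarrow (X_l, T) \leftrightarrow (\mathbf{U}_{l^c}, \mathbf{X}_{l^c}, X_{L+1}, Y, Z)$ for all $l$ in $\mathcal{L}$. 

Define the set 

\[ \mathcal{R}_i^{CI}(\lambda_i) = \Big\{ (\mathbf{R}, E) : \sum_{l \in S} R_l \ge I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z, T) \text{ for all } S \subseteq \mathcal{L}, \text{ and } E \le I(Y; \mathbf{U}, X_{L+1} | Z, T) \Big\} \]

and let 

\[ \mathcal{R}_i^{CI} = \bigcup_{\lambda_i \in \Lambda_i} \mathcal{R}_i^{CI}(\lambda_i). \]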

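The following lemma asserts that $\mathcal{R}_i^{CI}$ is computable and closed. 

Lemma 1. (a) $\mathcal{R}_i^{CI}$ remains unchanged if we impose the following cardinality bound on $(\mathbf{U}, T)$ in $\Lambda_i$: 

\[ |\mathcal{U}_l| \le |\mathcal{X}_l| + 2^L - 1 \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \]
\[ |\mathcal{T}| \le 2^L. \]

(b) $\mathcal{R}_i^{CI}$ is closed. 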

The proof of Lemma 1 is presented in Appendix A. Although the cardinality bound is exponential in the number of 
encoders, one can obtain an improved bound by exploiting the contra-polymatroid structure of $\mathcal{R}_i^{CI}$ [29], [30]. We do not do 
so here because it is technically involved and we just want to prove that $\mathcal{R}_i^{CI}$ is closed. The following theorem gives an inner 
bound to the rate-exponent region. 

Theorem 1. $\mathcal{R}_i^{CI} \subseteq \mathcal{R}^{CI}$. 

Theorem 1 is proved in Appendix B. 

Remark 1: Although our inner bound is stated for the special case of the test against conditional independence, it can be 
extended to the general case. But, the inner bound thus obtained will be quite complicated, with competing exponents, and it 
is not needed in this paper. 

It is worth pointing out that the Quantize-Bin-Test scheme is in general suboptimal for problems in which the encoders' 
observations have common randomness, i.e., there exist deterministic functions of the encoders' observations that are common to 
the encoders. However, it is straightforward to generalize this scheme by using the idea from the common-component scheme for 
distributed source coding problems [31]. 

4.2 Outer Bound 

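The outer bound is similar to the outer bound for the distributed rate-distortion problem given by Wagner and Anantharam 
[16]. Let $\Lambda_o$ be the set of finite-alphabet random variables $\lambda_o = (\mathbf{U}, W, T)$ satisfying 

(C3) $(W, T)$ is independent of $(\mathbf{X}, X_{L+1}, Y, Z)$, and 

(C4) $U_l \leftrightarrow (X_l, W, T) \leftrightarrow (\mathbf{U}_{l^c}, \mathbf{X}_{l^c}, X_{L+1}, Y, Z)$ for all $l$ in $\mathcal{L}$, 

and let $\chi$ be the set of finite-alphabet random variables $X$ such that $X_1, \dots, X_L, X_{L+1}, Y$ are conditionally independent given 
$(X, Z)$. Note that $\chi$ is nonempty because it contains $(\mathbf{X}, X_{L+1}, Y)$. For a given $X$ in $\chi$ and $\lambda_o$ in $\Lambda_o$, the joint distribution of $X$, 
$(\mathbf{X}, X_{L+1}, Y, Z)$, and $\lambda_o$ satisfies the Markov condition 

\[ X \leftrightarrow (\mathbf{X}, X_{L+1}, Y, Z) \leftrightarrow \lambda_o. \]

Wagner and Anantharam [16] refer to this condition as "Markov coupling" between $X$ and $\lambda_o$. Define the set 

\[ \mathcal{R}_o^{CI}(X, \lambda_o) = \Big\{ (\mathbf{R}, E) : \sum_{l \in S} R_l \ge I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z, T) + \sum_{l \in S} I(X_l; U_l | X, W, X_{L+1}, Z, T) \text{ for all } S \subseteq \mathcal{L}, \text{ and } E \le I(Y; \mathbf{U}, X_{L+1} | Z, T) \Big\}. \]

Also let 

\[ \mathcal{R}_o^{CI} = \bigcap_{X \in \chi} \bigcup_{\lambda_o \in \Lambda_o} \mathcal{R}_o^{CI}(X, \lambda_o). \]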

We have the following outer bound to the rate-exponent region. 

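Theorem 2. $\mathcal{R}_*^{CI} \subseteq \overline{\mathcal{R}_o^{CI}}$ and therefore $\mathcal{R}^{CI} \subseteq \overline{\mathcal{R}_o^{CI}}$. 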

The proof of the first inclusion is presented in Appendix C. The first inclusion and Corollary 1 imply the second inclusion. 
The next three sections provide examples in which the inner and outer bounds coincide. In Section 8, we will see how to 
extend the outer bound to the general problem. 



5 1-Encoder Hypothesis Testing against Conditional Independence 

In this section, we study a special case in which L = 1. We prove that the Quantize-Bin-Test scheme is optimal for this 
problem. We also prove that the Shimokawa-Han-Amari inner bound coincides with the Quantize-Bin-Test inner bound, 
establishing that the Shimokawa-Han-Amari scheme is also optimal. 

5.1 Rate-Exponent Region 

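Theorem 3. For this problem, the rate-exponent region 

\[ \mathcal{R}^{CI} = \overline{\mathcal{R}_o^{CI}} = \mathcal{R}_i^{CI} \tag{2} \]
\[ = \Big\{ (R_1, E) : \text{there exists } U_1 \text{ such that } R_1 \ge I(X_1; U_1 | X_2, Z), \; E \le I(Y; U_1, X_2 | Z), \; |\mathcal{U}_1| \le |\mathcal{X}_1| + 1, \text{ and } U_1 \leftrightarrow X_1 \leftrightarrow (X_2, Y, Z) \Big\}. \tag{3} \]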

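Proof. To show (2), it suffices to show that 

\[ \mathcal{R}_o^{CI} \subseteq \mathcal{R}_i^{CI}, \]

because $\mathcal{R}_i^{CI}$ is closed from Lemma 1(b). Consider $(R_1, E)$ in $\mathcal{R}_o^{CI}$. Take $X = X_2$. It is evident that $X_2$ is in $\chi$. Then there 
exists $\lambda_o = (U_1, W, T)$ in $\Lambda_o$ such that $(R_1, E)$ is in $\mathcal{R}_o^{CI}(X_2, \lambda_o)$, i.e., 

\begin{align*}
R_1 &\ge I(X_2; U_1 | X_2, Z, T) + I(X_1; U_1 | X_2, Z, W, T) \\
&= I(X_1; U_1 | X_2, Z, W, T),
\end{align*}

and 

\begin{align*}
E &\le I(Y; U_1, X_2 | Z, T) \\
&= H(Y | Z, T) - H(Y | U_1, X_2, Z, T) \\
&\le H(Y | Z, W, T) - H(Y | U_1, X_2, Z, W, T) \tag{4} \\
&= I(Y; U_1, X_2 | Z, W, T),
\end{align*}

where (4) follows from the fact that conditioning reduces entropy and the fact that $(Y, Z)$ is independent of $(W, T)$. If we set $\tilde{T} = (W, T)$, 
then it is easy to verify that $\lambda_i = (U_1, \tilde{T})$ is in $\Lambda_i$ and we have 

\begin{align*}
R_1 &\ge I(X_1; U_1 | X_2, Z, \tilde{T}), \text{ and} \tag{5} \\
E &\le I(Y; U_1, X_2 | Z, \tilde{T}). \tag{6}
\end{align*}

Therefore, $(R_1, E)$ is in $\mathcal{R}_i^{CI}(\lambda_i)$, which implies that $(R_1, E)$ is in $\mathcal{R}_i^{CI}$. This completes the proof of (2). 
To prove (3), it suffices to show that $\mathcal{R}_i^{CI}$ is contained in the set on the right-hand side of (3). 
The reverse containment immediately follows if we restrict $T$ to be deterministic in the definition of $\mathcal{R}_i^{CI}$. Continuing from 
the proof of (2), let $\bar{U}_1 = (U_1, \tilde{T})$. Since $(U_1, \tilde{T})$ is in $\Lambda_i$, we have that $\tilde{T}$ is independent of $(X_1, X_2, Y, Z)$ and that 

\[ U_1 \leftrightarrow (\tilde{T}, X_1) \leftrightarrow (X_2, Y, Z). \]

Both together imply that 

\[ \bar{U}_1 \leftrightarrow X_1 \leftrightarrow (X_2, Y, Z). \]

We next have from (5) that 

\begin{align*}
R_1 &\ge I(X_1; U_1 | X_2, Z, \tilde{T}) \\
&= I(X_1; U_1 | X_2, Z, \tilde{T}) + I(X_1; \tilde{T} | X_2, Z) \tag{7} \\
&= I(X_1; U_1, \tilde{T} | X_2, Z) \\
&= I(X_1; \bar{U}_1 | X_2, Z),
\end{align*}

where (7) follows because $\tilde{T}$ is independent of $(X_1, X_2, Y, Z)$. And (6) similarly yields 

\[ E \le I(Y; \bar{U}_1, X_2 | Z). \]

Using the support lemma [32, Lemma 3.4, pp. 310] as in the proof of Lemma 1(a), we can obtain the cardinality bound 

\[ |\bar{\mathcal{U}}_1| \le |\mathcal{X}_1| + 1. \]

We thus conclude that $(R_1, E)$ is in the set on the right-hand side of (3). □ 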
5.2 Optimality of Shimokawa-Han-Amari Scheme 

The Shimokawa-Han-Amari scheme operates as follows. Consider a test channel $P_{U_1|X_1}$, a sufficiently large block length $n$, 
and $\alpha > 0$. Let $\tilde{R}_1 = I(X_1; U_1) + \alpha$. To construct the codebook, we first generate $2^{n\tilde{R}_1}$ independent codewords $U_1^n$, each 
according to $\prod_{i=1}^{n} P_{U_1}(u_{1i})$, and then distribute them uniformly into $2^{nR_1}$ bins. The codebook and the bin assignment are 
revealed to the encoder and the detector. The encoder first quantizes $X_1^n$ by selecting a codeword $U_1^n$ that is jointly typical 
with it. With high probability, there will be at least one such codeword. The encoder then sends to the detector the index of the 
bin to which the codeword $U_1^n$ belongs. The joint type of $(X_1^n, U_1^n)$ is also sent to the detector, which requires zero additional 
rate asymptotically. The detector finds a codeword $\hat{U}_1^n$ in the bin that minimizes the empirical entropy $H(\hat{U}_1^n, Y^n)$. It then 
performs the test and declares $H_0$ if and only if both $(X_1^n, U_1^n)$ and $(Y^n, \hat{U}_1^n)$ are jointly typical under $H_0$. The inner bound 
thus obtained is as follows. Define 

\[ A(R_1) = \big\{ U_1 : R_1 \ge I(X_1; U_1 | X_2, Y, Z), \; U_1 \leftrightarrow X_1 \leftrightarrow (X_2, Y, Z), \text{ and } |\mathcal{U}_1| \le |\mathcal{X}_1| + 1 \big\} \]

\[ B(U_1) = \big\{ P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} : P_{\tilde{U}_1 \tilde{X}_1} = P_{U_1 X_1} \text{ and } P_{\tilde{U}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} = P_{U_1 X_2 Y Z} \big\} \]

\[ C(U_1) = \big\{ P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} : P_{\tilde{U}_1 \tilde{X}_1} = P_{U_1 X_1}, \; P_{\tilde{X}_2 \tilde{Y} \tilde{Z}} = P_{X_2 Y Z}, \text{ and } H(\tilde{U}_1 | \tilde{X}_2, \tilde{Y}, \tilde{Z}) \ge H(U_1 | X_2, Y, Z) \big\}. \]
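
The codebook construction, binning, and minimum-empirical-entropy decoding described at the start of this subsection can be sketched on a toy binary alphabet as follows. This is a minimal illustration under stated assumptions, not the scheme's actual implementation: the codeword distribution is Bernoulli, a nearest-codeword (Hamming distance) rule stands in for joint-typicality quantization, the joint type is not transmitted, and the final typicality test is omitted. All function names and parameters are hypothetical.

```python
import random
from collections import Counter
from math import log2


def build_codebook(num_codewords, num_bins, n, p_one, seed=0):
    """Draw i.i.d. Bernoulli(p_one) codewords and assign each to a uniformly random bin."""
    rng = random.Random(seed)
    codewords = [tuple(int(rng.random() < p_one) for _ in range(n))
                 for _ in range(num_codewords)]
    bins = [rng.randrange(num_bins) for _ in range(num_codewords)]
    return codewords, bins


def empirical_joint_entropy(u, y):
    """Empirical joint entropy (bits per symbol) of two equal-length sequences."""
    counts = Counter(zip(u, y))
    n = len(u)
    return -sum(c / n * log2(c / n) for c in counts.values())


def encode(x, codewords, bins):
    """Toy quantizer (assumption): pick the Hamming-nearest codeword, return (index, bin)."""
    idx = min(range(len(codewords)),
              key=lambda k: sum(a != b for a, b in zip(codewords[k], x)))
    return idx, bins[idx]


def decode(bin_index, y, codewords, bins):
    """Among codewords in the received bin, pick the one minimizing empirical joint entropy with y."""
    candidates = [k for k, b in enumerate(bins) if b == bin_index]
    return min(candidates, key=lambda k: empirical_joint_entropy(codewords[k], y))


if __name__ == "__main__":
    codewords, bins = build_codebook(num_codewords=64, num_bins=8, n=16, p_one=0.5)
    rng = random.Random(1)
    x = tuple(rng.randrange(2) for _ in range(16))
    y = x  # toy side information: perfectly correlated with x
    idx, b = encode(x, codewords, bins)
    decoded = decode(b, y, codewords, bins)
    print("sent bin", b, "true index", idx, "decoded index", decoded)
```

With more codewords per bin, the decoder is more likely to pick a wrong codeword from the bin, which is the binning error whose exponent is defined next.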
Figure 5: Shimokawa-Han-Amari achievable region for a fixed $P_{U_1|X_1}$ (rate axis $R_1$, with marked points $I(X_1; U_1 | X_2, Y, Z)$, $I(X_1; U_1 | X_2, Z)$, and $I(X_1; U_1)$) 



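In addition, define the exponents 

\[ \rho_1^*(U_1) = \min_{P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \in B(U_1)} D\big( P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{U_1|X_1} P_{X_1 X_2|Z} P_{Y|Z} P_Z \big) \]

\[ \rho_2^*(U_1) = \begin{cases} +\infty & \text{if } R_1 \ge I(U_1; X_1) \\ \rho_2(U_1) & \text{otherwise} \end{cases} \]

\[ \rho_2(U_1) = \big[ R_1 - I(X_1; U_1 | X_2, Y, Z) \big]^+ + \min_{P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \in C(U_1)} D\big( P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{U_1|X_1} P_{X_1 X_2|Z} P_{Y|Z} P_Z \big). \]

Finally, define 

\[ E_{SHA}(R_1) = \max_{U_1 \in A(R_1)} \min \big( \rho_1^*(U_1), \rho_2^*(U_1) \big). \]

Recall that $\rho_2^*(U_1)$ and $\rho_1^*(U_1)$ are the exponents associated with type 2 errors due to binning errors and assuming correct 
decoding of the codeword, respectively. 

Theorem 4. $(R_1, E)$ is in the rate-exponent region if 

\[ E \le E_{SHA}(R_1). \]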

Fig. 5 shows the Shimokawa-Han-Amari achievable exponent as a function of the rate assuming a fixed channel $P_{U_1|X_1}$ is 
used for quantization. This is simply Fig. 2 particularized to the 1-encoder hypothesis testing against conditional independence 
problem. For rates $R_1 \ge I(X_1; U_1 | X_2, Z)$, $\rho_1^*(U_1)$ dominates $\rho_2^*(U_1)$ and there is no penalty for binning at these rates as 
the exponent stays the same. Therefore, we can bin all the way down to the rate $R_1 = I(X_1; U_1 | X_2, Z)$ without any loss 
in the exponent. However, if we bin further at rates $R_1$ in $[I(X_1; U_1 | X_2, Y, Z), I(X_1; U_1 | X_2, Z))$, then $\rho_2^*(U_1)$ dominates 
$\rho_1^*(U_1)$, the exponent decreases linearly with $R_1$, and the performance deteriorates all the way down to a point at which the 
message from the encoder is useless. At this point, the binning rate $R_1$ equals $I(X_1; U_1 | X_2, Y, Z)$ and the exponent equals 
$I(Y; X_2 | Z)$, which is the exponent when the detector ignores the encoder's message. This competition between the exponents 
makes the optimality of the Shimokawa-Han-Amari scheme unclear. We prove that it is indeed optimal by showing that the 
Shimokawa-Han-Amari inner bound simplifies to the Quantize-Bin-Test inner bound, which by Theorem 3 is tight. Let us 
define 

\[ A^*(R_1) = \big\{ U_1 : R_1 \ge I(X_1; U_1 | X_2, Z), \; U_1 \leftrightarrow X_1 \leftrightarrow (X_2, Y, Z), \text{ and } |\mathcal{U}_1| \le |\mathcal{X}_1| + 1 \big\} \]

and 

\[ E_{QBT}(R_1) = \max_{U_1 \in A^*(R_1)} I(Y; U_1, X_2 | Z). \]

We have the following theorem. 

Theorem 5. If $(R_1, E)$ is in the rate-exponent region, then 

\[ E \le E_{QBT}(R_1) = E_{SHA}(R_1). \]
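Proof. The inequality follows from Theorem 3. To prove the equality, it is sufficient to show that 

\[ E_{SHA}(R_1) \ge E_{QBT}(R_1). \]

The reverse inequality follows from Theorems 3 and 4. Since conditioning reduces entropy and any $U_1$ in $A^*(R_1)$ satisfies the 
Markov chain 

\[ U_1 \leftrightarrow X_1 \leftrightarrow (X_2, Y, Z), \]

we have 

\begin{align*}
R_1 &\ge I(X_1; U_1 | X_2, Z) \\
&= H(U_1 | X_2, Z) - H(U_1 | X_1, X_2, Z) \\
&\ge H(U_1 | X_2, Y, Z) - H(U_1 | X_1, X_2, Y, Z) \\
&= I(X_1; U_1 | X_2, Y, Z),
\end{align*}

which means that $U_1$ is in $A(R_1)$. Hence, $A^*(R_1) \subseteq A(R_1)$. This implies that 

\begin{align*}
E_{SHA}(R_1) &= \max_{U_1 \in A(R_1)} \min \big( \rho_1^*(U_1), \rho_2^*(U_1) \big) \\
&\ge \max_{U_1 \in A^*(R_1)} \min \big( \rho_1^*(U_1), \rho_2^*(U_1) \big). \tag{8}
\end{align*}

Now the objective of the optimization problem in the definition of $\rho_1^*(U_1)$ can be lower bounded as 

\begin{align*}
D\big( P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{U_1|X_1} P_{X_1 X_2|Z} P_{Y|Z} P_Z \big) &\ge D\big( P_{\tilde{U}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{U_1 X_2|Z} P_{Y|Z} P_Z \big) \\
&= D\big( P_{U_1 X_2 Y Z} \,\big\|\, P_{U_1 X_2|Z} P_{Y|Z} P_Z \big) \\
&= I(Y; U_1, X_2 | Z).
\end{align*}

The lower bound is achieved by the distribution $P_{U_1 X_2 Y Z} P_{X_1 | U_1 X_2 Z}$ in $B(U_1)$. Therefore, 

\[ \rho_1^*(U_1) = I(Y; U_1, X_2 | Z). \]
Similarly, we can lower bound the optimization problem in the definition of $\rho_2(U_1)$ as 

\begin{align*}
D\big( P_{\tilde{U}_1 \tilde{X}_1 \tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{U_1|X_1} P_{X_1 X_2|Z} P_{Y|Z} P_Z \big) &\ge D\big( P_{\tilde{X}_2 \tilde{Y} \tilde{Z}} \,\big\|\, P_{X_2|Z} P_{Y|Z} P_Z \big) \\
&= D\big( P_{X_2 Y Z} \,\big\|\, P_{X_2|Z} P_{Y|Z} P_Z \big) \\
&= I(Y; X_2 | Z),
\end{align*}

and the lower bound is achieved by the distribution $P_{X_2 Y Z} P_{U_1 X_1 | X_2 Z}$ in $C(U_1)$. Therefore, 

\[ \rho_2(U_1) = \big[ R_1 - I(X_1; U_1 | X_2, Y, Z) \big]^+ + I(Y; X_2 | Z). \]

Consider any $U_1$ in $A^*(R_1)$. If $R_1 \ge I(X_1; U_1)$, then 

\begin{align*}
\min \big( \rho_1^*(U_1), \rho_2^*(U_1) \big) &= \rho_1^*(U_1) \\
&= I(Y; U_1, X_2 | Z). \tag{9}
\end{align*}

And if $I(X_1; U_1) > R_1 \ge I(X_1; U_1 | X_2, Z)$, then 

\begin{align*}
\min \big( \rho_1^*(U_1), \rho_2^*(U_1) \big) &= \min \big( I(Y; U_1, X_2 | Z), \; R_1 - I(X_1; U_1 | X_2, Y, Z) + I(Y; X_2 | Z) \big) \\
&\ge \min \big( I(Y; U_1, X_2 | Z), \; I(X_1; U_1 | X_2, Z) - I(X_1; U_1 | X_2, Y, Z) + I(Y; X_2 | Z) \big) \\
&= \min \big( I(Y; U_1, X_2 | Z), \; I(Y; U_1 | X_2, Z) + I(Y; X_2 | Z) \big) \\
&= \min \big( I(Y; U_1, X_2 | Z), \; I(Y; U_1, X_2 | Z) \big) \\
&= I(Y; U_1, X_2 | Z). \tag{10}
\end{align*}

Now (8) through (10) imply 

\begin{align*}
E_{SHA}(R_1) &\ge \max_{U_1 \in A^*(R_1)} I(Y; U_1, X_2 | Z) \\
&= E_{QBT}(R_1).
\end{align*}

Theorem 5 is thus proved. 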



□ 
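
The closed forms obtained in the proof above make the rate-exponent tradeoff for a fixed test channel easy to evaluate. The sketch below is a hedged illustration: it takes the relevant mutual information values as plain numbers (the values in the usage example are hypothetical, not computed from any particular source) and returns $\min(\rho_1^*(U_1), \rho_2^*(U_1))$ using $\rho_1^*(U_1) = I(Y; U_1, X_2 | Z)$ and $\rho_2(U_1) = [R_1 - I(X_1; U_1 | X_2, Y, Z)]^+ + I(Y; X_2 | Z)$. The function and parameter names are assumptions made for this example.

```python
def sha_exponent(rate, i_xu, i_xu_given_x2yz, i_y_ux2_given_z, i_y_x2_given_z):
    """Exponent min(rho_1^*, rho_2^*) for a fixed test channel, from the closed forms above.

    rate              -- R_1
    i_xu              -- I(X_1; U_1)
    i_xu_given_x2yz   -- I(X_1; U_1 | X_2, Y, Z)
    i_y_ux2_given_z   -- I(Y; U_1, X_2 | Z)
    i_y_x2_given_z    -- I(Y; X_2 | Z)
    """
    if rate < i_xu_given_x2yz:
        raise ValueError("U_1 is not in A(R_1) at this rate")
    rho1 = i_y_ux2_given_z
    if rate >= i_xu:
        # No binning is needed, so the binning exponent is infinite.
        return rho1
    rho2 = (rate - i_xu_given_x2yz) + i_y_x2_given_z
    return min(rho1, rho2)


if __name__ == "__main__":
    # Hypothetical values chosen so that
    # I(Y;U1,X2|Z) = I(Y;X2|Z) + I(X1;U1|X2,Z) - I(X1;U1|X2,Y,Z) with I(X1;U1|X2,Z) = 0.6.
    params = dict(i_xu=1.0, i_xu_given_x2yz=0.2,
                  i_y_ux2_given_z=0.5, i_y_x2_given_z=0.1)
    for r in (0.2, 0.4, 0.6, 0.8, 1.5):
        print(r, sha_exponent(r, **params))
```

With these illustrative numbers the exponent grows linearly from $I(Y; X_2 | Z)$ at $R_1 = I(X_1; U_1 | X_2, Y, Z)$ and saturates at $I(Y; U_1, X_2 | Z)$ once $R_1$ reaches $I(X_1; U_1 | X_2, Z)$, which is the shape described for Fig. 5.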



6 Gel'fand and Pinsker Hypothesis Testing against Independence 

We now consider another special case, which we call the Gel'fand and Pinsker hypothesis testing against independence 
problem, because it is related to the source coding problem studied by Gel'fand and Pinsker [22]. 

Suppose that $X_{L+1}$ and $Z$ are deterministic and suppose there exists a function of $X_1, \dots, X_L$, say $X$, such that under 
$H_0$, 

(C5) $X_1, \dots, X_L, Y$ are conditionally independent given $X$, and 

(C6) for any finite-alphabet random variable $U$ such that $Y \leftrightarrow X \leftrightarrow U$ and $Y \leftrightarrow U \leftrightarrow X$, we have $H(X|U) = 0$. 

Conditions (C5) and (C6) imply that under $H_0$, $X$ is a minimal sufficient statistic for $Y$ given $\mathbf{X}$ such that $X_1, \dots, X_L, Y$ are 
conditionally independent given $X$. We shall characterize the centralized rate region, the set of rate vectors that achieve the 
centralized type 2 error exponent $I(\mathbf{X}; Y) = I(X; Y)$. More precisely, we shall characterize the set 

\[ \big\{ \mathbf{R} : (\mathbf{R}, I(X; Y)) \in \mathcal{R}^{CI} \big\}, \]

denoted by $\mathcal{R}^{CI}(I(X; Y))$. We define $\mathcal{R}_i^{CI}(I(X; Y))$ and $\overline{\mathcal{R}_o^{CI}}(I(X; Y))$ similarly. We need the following lemma. 
Lemma 2. Condition (C6) is equivalent to 

(C7) For any positive $\epsilon$, there exists a positive $\delta$ such that for all finite-alphabet random variables $U$ such that $Y \leftrightarrow X \leftrightarrow U$ 
and $I(X; Y | U) \le \delta$, we have $H(X|U) \le \epsilon$. 

The proof of Lemma 2 is presented in Appendix D. Let us define a function 

\[ \phi(\delta) = \inf \big\{ \epsilon : \text{for all finite-alphabet } U \text{ such that } Y \leftrightarrow X \leftrightarrow U \text{ and } I(X; Y | U) \le \delta, \text{ we have } H(X|U) \le \epsilon \big\}. \]
It is clear that $\phi$ is continuous at zero with the value $\phi(0) = 0$. We have the following theorem. 
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Theorem 6. For this problem, the centralized rate region 

n'''{l{X;Y)) =nf'{l{X;Y)) =ncT(^X;Y)). 

Proof. It suffices to show that 

W{l{X;Y))cnf'{l{X-Y)). 



Consider any R in TZ^^(l{X; Y)), any positive 6, and X defined as above. Then there exists Ao = (U, W, T) in Aq such 
that {Ri+6,...,Rl + 6, I{X; Y) - 5) is in 7^^^(X, A^,), i.e., 

^(i?;+<5) >/(X;Us|Usc,r) + ^/(Xi;t/(|X,V^,r) forall 5C/:, and (11) 
les les 

I{X;Y) - 6 < I{Y;V\T). (12) 
We have the Markov chain 

FoXo (U,T), 

which imphes 

I{X; y|U, T) = H{Y\V, T) - H{Y\X, U, T) 
= H{Y\V,T)-H{Y\X) 
= I{X;Y)-I{Y;\J\T) 
<S, 

where the last inequality follows from (12). Therefore, by the definition of <p function 

H{X\V,T)<(l){d). (13) 

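Now

\[ \begin{aligned} I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, T) &= H(X | \mathbf{U}_{S^c}, T) - H(X | \mathbf{U}, T) \\ &\ge H(X | \mathbf{U}_{S^c}, W, T) - \phi(\delta) \qquad (14) \\ &\ge I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, W, T) - \phi(\delta), \end{aligned} \]

where (14) follows from (13) and the fact that conditioning reduces entropy. This together with (11) implies

\[ \begin{aligned} \sum_{l \in S} (R_l + \delta) + \phi(\delta) &\ge I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, W, T) + \sum_{l \in S} I(X_l; U_l | X, W, T) \\ &= I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, W, T) + I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X, W, T) \\ &= I(X, \mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, W, T) \\ &\ge I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, W, T). \end{aligned} \]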

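Again since conditioning reduces entropy and $Y$ is independent of $(W, T)$, we obtain from (12) that

\[ \begin{aligned} I(X;Y) - \delta &\le I(Y; \mathbf{U} | T) \\ &= H(Y | T) - H(Y | \mathbf{U}, T) \\ &\le H(Y | W, T) - H(Y | \mathbf{U}, W, T) \\ &= I(Y; \mathbf{U} | W, T). \end{aligned} \]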
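Define $\tilde{T} = (W, T)$. It is then clear that $\lambda_i = (\mathbf{U}, \tilde{T})$ is in $\Lambda_i$,

\[ \sum_{l \in S} (R_l + \delta + \phi(\delta)) \ge I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, \tilde{T}) \quad \text{for all } S \subseteq \mathcal{L}, \text{ and} \]

\[ I(X;Y) - \delta \le I(Y; \mathbf{U} | \tilde{T}). \]

Hence, $(R_1 + \delta + \phi(\delta), \ldots, R_L + \delta + \phi(\delta), I(X;Y) - \delta)$ is in $\mathcal{R}_i^{CI}(\lambda_i)$. Since $\delta$ is arbitrary and $\phi(\delta) \to 0$ as $\delta \to 0$, this implies that $(\mathbf{R}, I(X;Y))$ is in $\mathcal{R}_i^{CI}$ because $\mathcal{R}_i^{CI}$ is closed from Lemma 1(b). Therefore, $\mathbf{R}$ is in $\mathcal{R}_i^{CI}(I(X;Y))$. □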



7 Gaussian Many-Help-One Hypothesis Testing against Independence 

We now turn to a continuous example of the problem studied in Section 4. This problem is related to the quadratic Gaussian many-help-one source coding problem [16, 23, 24]. We first obtain an outer bound similar to the one in Theorem 2 and then show that it is achieved by the Quantize-Bin-Test scheme.

Let $(X, Y, X_1, \ldots, X_L)$ be a zero-mean Gaussian random vector such that

\[ X_l = X + N_l \]

for each $l$ in $\mathcal{L}$. $X$ and $Y$ are correlated under the null hypothesis $H_0$ and are independent under the alternate hypothesis $H_1$, i.e.,

\[ H_0 : Y = X + N, \qquad H_1 : Y \perp X. \]

We assume that X, N, Ni, N2, . . . , Nl are mutually independent, and that aj^ and are positive. The setup of the problem 
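is shown in Fig. 4. Unlike the previous problem, we now allow $X$ to be observed by an encoder, which sends a message to the detector at a finite rate $R$. We use $f^{(n)}$ to denote the corresponding encoding function. In order to be consistent with the source coding terminology, we call this the main encoder. The encoder observing $X_l$ is now called helper $l$. We assume that $X_{L+1}$ and $Z$ are deterministic. The rest of the problem formulation is the same as the one in Section 3.1. Let $\mathcal{R}^{MHO}$ be the rate-exponent region of this problem. We need the entropy characterization of $\mathcal{R}^{MHO}$. For that, define

\[ \mathcal{R}_*^{MHO} \triangleq \bigcup_{n} \bigcup_{f^{(n)}, (f_l^{(n)})_{l \in \mathcal{L}}} \mathcal{R}_*^{MHO}\left(f^{(n)}, (f_l^{(n)})_{l \in \mathcal{L}}\right), \]

where

\[ \mathcal{R}_*^{MHO}\left(f^{(n)}, (f_l^{(n)})_{l \in \mathcal{L}}\right) \triangleq \Big\{ (R, \mathbf{R}, E) : R \ge \frac{1}{n} \log \left| f^{(n)}(X^n) \right|, \; R_l \ge \frac{1}{n} \log \left| f_l^{(n)}(X_l^n) \right| \text{ for all } l \text{ in } \mathcal{L}, \text{ and } E \le \frac{1}{n} I\left(Y^n; f^{(n)}(X^n), (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}\right) \Big\}. \]

Corollary 2. $\mathcal{R}^{MHO} = \overline{\mathcal{R}_*^{MHO}}$.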

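The proof of this result is almost identical to that of Proposition 1. Define the set

\[ \tilde{\mathcal{R}}^{MHO} \triangleq \Big\{ (R, R_1, \ldots, R_L, E) : \text{there exists } (r_1, \ldots, r_L) \in \mathbb{R}_+^L \text{ such that } R_l \ge r_l \text{ for all } l \text{ in } \mathcal{L}, \text{ and } \]
\[ R + \sum_{l \in S} R_l \ge \frac{1}{2} \log^+ \left[ \frac{1}{D} \left( \frac{1}{\sigma_X^2} + \sum_{l \in S^c} \frac{1 - 2^{-2 r_l}}{\sigma_{N_l}^2} \right)^{-1} \right] + \sum_{l \in S} r_l \text{ for all } S \subseteq \mathcal{L} \Big\}, \]

where

\[ D = (\sigma_X^2 + \sigma_N^2) 2^{-2E} - \sigma_N^2. \]

Theorem 7. The rate-exponent region of this problem

\[ \mathcal{R}^{MHO} = \tilde{\mathcal{R}}^{MHO}. \]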

Proof. The proof of inclusion $\mathcal{R}^{MHO} \subseteq \tilde{\mathcal{R}}^{MHO}$ is similar to the converse proof of the Gaussian many-help-one source coding problem by Oohama [23] and Prabhakaran et al. [24] (see also [16]). Their proofs continue to work if we replace the original mean square error distortion constraint with the mutual information constraint that we have here. It is noteworthy though that Wang et al.'s [33] approach does not work here because it relies on the distortion constraint.

We start with the continuous extension of Theorem 2. Let $\Lambda_o$ be the set of random variables $\lambda_o = (U, \mathbf{U}, W, T)$ such that each take values in a finite-dimensional Euclidean space and collectively they satisfy

(C8) $(W, T)$ is independent of $(X, \mathbf{X}, Y)$,

(C9) $U \leftrightarrow (X, W, T) \leftrightarrow (\mathbf{U}, \mathbf{X}, Y)$,

(C10) $U_l \leftrightarrow (X_l, W, T) \leftrightarrow (U, \mathbf{U}_{l^c}, X, \mathbf{X}_{l^c}, Y)$ for all $l$ in $\mathcal{L}$, and

(C11) the conditional distribution of $U_l$ given $(W, T)$ is discrete for each $l$.

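Define the set

\[ \mathcal{R}_o^{MHO}(\lambda_o) \triangleq \Big\{ (R, \mathbf{R}, E) : R_l \ge I(X_l; U_l | X, W, T) \text{ for all } l \text{ in } \mathcal{L}, \tag{15} \]
\[ R + \sum_{l \in S} R_l \ge I(X; U, \mathbf{U}_S | \mathbf{U}_{S^c}, T) + \sum_{l \in S} I(X_l; U_l | X, W, T) \text{ for all } S \subseteq \mathcal{L}, \text{ and} \tag{16} \]
\[ E \le I(Y; U, \mathbf{U} | T) \Big\}. \tag{17} \]

Finally, let

\[ \mathcal{R}_o^{MHO} \triangleq \bigcup_{\lambda_o \in \Lambda_o} \mathcal{R}_o^{MHO}(\lambda_o). \]

We have the following lemma.

Lemma 3. $\mathcal{R}_*^{MHO} \subseteq \mathcal{R}_o^{MHO}$.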

The inequalities (16) and (17) can be established as in the proof of Theorem 2. In particular, we obtain (16) by considering 
only those constraints on the sum of rate combinations that include R. The inequality (15) is not present in Theorem 2. 
However, it can be derived easily. We need the following lemma. 



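Lemma 4. [16, Lemma 9] If $\lambda_o$ is in $\Lambda_o$, then for all $S \subseteq \mathcal{L}$,

\[ \frac{1}{\sigma_X^2} 2^{2 I(X; \mathbf{U}_S | W, T)} \le \frac{1}{\sigma_X^2} + \sum_{l \in S} \frac{1 - 2^{-2 I(X_l; U_l | X, W, T)}}{\sigma_{N_l}^2}. \]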



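Consider any $(R, \mathbf{R}, E)$ in $\mathcal{R}_o^{MHO}$. Then there exists $\lambda_o$ in $\Lambda_o$ such that for all $S \subseteq \mathcal{L}$,

\[ \begin{aligned} R + \sum_{l \in S} R_l &\ge I(X; U, \mathbf{U}_S | \mathbf{U}_{S^c}, T) + \sum_{l \in S} I(X_l; U_l | X, W, T) \\ &= I(X; U, \mathbf{U} | T) - I(X; \mathbf{U}_{S^c} | T) + \sum_{l \in S} I(X_l; U_l | X, W, T), \end{aligned} \tag{18} \]

and

\[ E \le I(Y; U, \mathbf{U} | T). \tag{19} \]

We can lower bound the first term in (18) by applying the entropy power inequality [34] and obtain

\[ 2^{2 h(Y | U, \mathbf{U}, T)} = 2^{2 h(X + N | U, \mathbf{U}, T)} \ge 2^{2 h(X | U, \mathbf{U}, T)} + 2^{2 h(N)}, \]

which simplifies to

\[ h(Y | U, \mathbf{U}, T) \ge \frac{1}{2} \log \left( 2^{2 h(X | U, \mathbf{U}, T)} + 2 \pi e \sigma_N^2 \right). \tag{20} \]

Now (19) and (20) together imply

\[ I(X; U, \mathbf{U} | T) \ge \frac{1}{2} \log \frac{\sigma_X^2}{(\sigma_X^2 + \sigma_N^2) 2^{-2E} - \sigma_N^2}. \tag{21} \]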

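We next upper bound the second term in (18). Since conditioning reduces entropy and $X$ is independent of $(W, T)$, we have

\[ \begin{aligned} I(X; \mathbf{U}_{S^c} | T) &= h(X | T) - h(X | \mathbf{U}_{S^c}, T) \\ &\le h(X | W, T) - h(X | \mathbf{U}_{S^c}, W, T) \\ &= I(X; \mathbf{U}_{S^c} | W, T). \end{aligned} \tag{22} \]

Define

\[ r_l \triangleq I(X_l; U_l | X, W, T). \]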

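Then we have from (18), (21), (22), and Lemma 4 that

\[ R + \sum_{l \in S} R_l \ge \frac{1}{2} \log \left[ \frac{1}{(\sigma_X^2 + \sigma_N^2) 2^{-2E} - \sigma_N^2} \left( \frac{1}{\sigma_X^2} + \sum_{l \in S^c} \frac{1 - 2^{-2 r_l}}{\sigma_{N_l}^2} \right)^{-1} \right] + \sum_{l \in S} r_l. \]

Together with (15), which gives $R_l \ge r_l$ for all $l$ in $\mathcal{L}$, this shows that $(R, \mathbf{R}, E)$ is in $\tilde{\mathcal{R}}^{MHO}$. On applying Lemma 3 and Corollary 2, we obtain $\mathcal{R}^{MHO} \subseteq \tilde{\mathcal{R}}^{MHO}$.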



Figure 6: Gaussian many-help-one source coding problem
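We use the Quantize-Bin-Test scheme to prove the reverse inclusion. Consider $(R, \mathbf{R}, E)$ in $\tilde{\mathcal{R}}^{MHO}$. Then there exists $\mathbf{r} \in \mathbb{R}_+^L$ such that

\[ R_l \ge r_l \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \]

\[ R + \sum_{l \in S} R_l \ge \frac{1}{2} \log^+ \left[ \frac{1}{D} \left( \frac{1}{\sigma_X^2} + \sum_{l \in S^c} \frac{1 - 2^{-2 r_l}}{\sigma_{N_l}^2} \right)^{-1} \right] + \sum_{l \in S} r_l \text{ for all } S \subseteq \mathcal{L}. \]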



We therefore have from Oohama's result [23] that $(R, \mathbf{R}, D)$ is achievable for the quadratic Gaussian many-help-one source coding problem, the setup of which is shown in Fig. 6. In this problem, the main encoder and helpers operate as before. The decoder however uses all available information to estimate $X$ such that the mean square error of the estimate is no more than a fixed positive number $D$. Since $(R, \mathbf{R}, D)$ is achievable, it follows by Oohama's achievability proof that for any positive $\delta$ and sufficiently large $n$, there exist quantize-and-bin encoders $f^{(n)}, f_1^{(n)}, \ldots, f_L^{(n)}$ and a decoder $\psi^{(n)}$ such that



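\[ R + \delta \ge \frac{1}{n} \log \left| f^{(n)}(X^n) \right|, \tag{23} \]

\[ R_l + \delta \ge \frac{1}{n} \log \left| f_l^{(n)}(X_l^n) \right| \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \tag{24} \]

\[ \frac{1}{n} \sum_{i=1}^{n} E\left[ \left( X^n(i) - \hat{X}^n(i) \right)^2 \right] \le D + \delta, \tag{25} \]

where

\[ \hat{X}^n = \psi^{(n)}\left( f^{(n)}(X^n), (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} \right). \]

For each $i$, we have

\[ \begin{aligned} E\left[ \left( Y^n(i) - \hat{X}^n(i) \right)^2 \right] &= E\left[ \left( (Y^n(i) - X^n(i)) + (X^n(i) - \hat{X}^n(i)) \right)^2 \right] \\ &= E\left[ \left( Y^n(i) - X^n(i) \right)^2 \right] + E\left[ \left( X^n(i) - \hat{X}^n(i) \right)^2 \right], \end{aligned} \]

where the last equality follows because

\[ Y^n(i) \leftrightarrow X^n(i) \leftrightarrow \hat{X}^n(i). \]

By averaging over time, we obtain

\[ \begin{aligned} \frac{1}{n} \sum_{i=1}^{n} E\left[ \left( Y^n(i) - \hat{X}^n(i) \right)^2 \right] &= \frac{1}{n} \sum_{i=1}^{n} E\left[ \left( Y^n(i) - X^n(i) \right)^2 \right] + \frac{1}{n} \sum_{i=1}^{n} E\left[ \left( X^n(i) - \hat{X}^n(i) \right)^2 \right] \\ &\le \sigma_N^2 + D + \delta, \end{aligned} \]

where the last inequality follows from (25). Therefore, the code achieves a distortion $\sigma_N^2 + D + \delta$ in $Y$. Hence,

\[ \frac{1}{n} I\left( Y^n; f^{(n)}(X^n), (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} \right) \]

must be no less than the rate-distortion function of $Y$ at a distortion $\sigma_N^2 + D + \delta$, i.e.,

\[ \begin{aligned} \frac{1}{n} I\left( Y^n; f^{(n)}(X^n), (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} \right) &\ge \frac{1}{2} \log \frac{\sigma_X^2 + \sigma_N^2}{\sigma_N^2 + D + \delta} \\ &= \frac{1}{2} \log \frac{\sigma_X^2 + \sigma_N^2}{(\sigma_X^2 + \sigma_N^2) 2^{-2(E - \tilde{\delta})}} \qquad (26) \\ &= E - \tilde{\delta}, \qquad (27) \end{aligned} \]

where (26) follows for a positive $\tilde{\delta}$ such that $\tilde{\delta} \to 0$ as $\delta \to 0$. We now have from (23), (24), and (27) that $(R, \mathbf{R}, E)$ is in $\overline{\mathcal{R}_*^{MHO}}$. Hence by Corollary 2, $\tilde{\mathcal{R}}^{MHO} \subseteq \mathcal{R}^{MHO}$. □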
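The rate constraints that define the region in Theorem 7 can be checked numerically for a fixed auxiliary vector. The following is a minimal sketch, assuming logarithms to base 2; the function names and the argument layout (`var_x`, `var_n`, `var_helpers` for the variances of $X$, $N$, and the $N_l$) are illustrative choices and not notation from the paper. It tests one given $\mathbf{r}$ only, so a full membership test would additionally search over $\mathbf{r}$.

```python
import itertools
import math


def distortion_from_exponent(exponent, var_x, var_n):
    """Map an exponent E to D = (var_x + var_n) * 2^(-2E) - var_n."""
    return (var_x + var_n) * 2.0 ** (-2.0 * exponent) - var_n


def log_plus(value):
    """log^+ in bits: max(0, log2(value))."""
    return max(0.0, math.log2(value))


def satisfies_constraints(main_rate, helper_rates, exponent, r,
                          var_x, var_n, var_helpers):
    """Check the sum-rate constraints for one fixed auxiliary vector r."""
    distortion = distortion_from_exponent(exponent, var_x, var_n)
    if distortion <= 0.0:
        # E is at or beyond I(X;Y); no finite rates can satisfy the bounds.
        return False
    helpers = range(len(helper_rates))
    if any(helper_rates[l] < r[l] for l in helpers):
        return False
    for size in range(len(helper_rates) + 1):
        for subset in itertools.combinations(helpers, size):
            complement = [l for l in helpers if l not in subset]
            precision = 1.0 / var_x + sum(
                (1.0 - 2.0 ** (-2.0 * r[l])) / var_helpers[l]
                for l in complement
            )
            bound = (0.5 * log_plus(1.0 / (distortion * precision))
                     + sum(r[l] for l in subset))
            if main_rate + sum(helper_rates[l] for l in subset) < bound:
                return False
    return True
```

With unit variances and a single idle helper, the check reduces to the point-to-point requirement $R \ge \frac{1}{2} \log (\sigma_X^2 / D)$.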
7.1 Special Cases

Consider the following special cases. We continue to use the terminology from the source coding literature. 

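1. Gaussian CEO hypothesis testing against independence: When $R = 0$, the problem reduces to the Gaussian CEO hypothesis testing against independence problem. Let $\mathcal{R}^{CEO}$ be the rate-exponent region of this problem. Define the set

\[ \tilde{\mathcal{R}}^{CEO} \triangleq \Big\{ (R_1, \ldots, R_L, E) : \text{there exists } \mathbf{r} \in \mathbb{R}_+^L \text{ such that } \]
\[ \sum_{l \in S} R_l \ge \frac{1}{2} \log^+ \left[ \frac{1}{D} \left( \frac{1}{\sigma_X^2} + \sum_{l \in S^c} \frac{1 - 2^{-2 r_l}}{\sigma_{N_l}^2} \right)^{-1} \right] + \sum_{l \in S} r_l \text{ for all } S \subseteq \mathcal{L} \Big\}. \]

We immediately have the following corollary as a consequence of Theorem 7.

Corollary 3. $\mathcal{R}^{CEO} = \tilde{\mathcal{R}}^{CEO}$.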

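2. Gaussian one-helper hypothesis testing against independence: When $L = 1$, the problem reduces to the Gaussian one-helper hypothesis testing against independence problem. Let $\mathcal{R}^{OH}$ be the rate-exponent region of this problem. Define the sets

\[ \tilde{\mathcal{R}}^{OH} \triangleq \Big\{ (R, R_1, E) : \text{there exists } r_1 \in \mathbb{R}_+ \text{ such that } R_1 \ge r_1, \]
\[ R + R_1 \ge \frac{1}{2} \log^+ \left[ \frac{\sigma_X^2}{D} \right] + r_1, \text{ and } R \ge \frac{1}{2} \log^+ \left[ \frac{1}{D} \left( \frac{1}{\sigma_X^2} + \frac{1 - 2^{-2 r_1}}{\sigma_{N_1}^2} \right)^{-1} \right] \Big\} \]

and

\[ \bar{\mathcal{R}}^{OH} \triangleq \Big\{ (R, R_1, E) : R \ge \frac{1}{2} \log^+ \left[ \frac{\sigma_X^2}{D} \left( 1 - \rho^2 + \rho^2 2^{-2 R_1} \right) \right] \Big\}, \]

where

\[ \rho^2 = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_{N_1}^2}. \]

Corollary 4. $\mathcal{R}^{OH} = \tilde{\mathcal{R}}^{OH} = \bar{\mathcal{R}}^{OH}$.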

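Proof. The first equality follows from Theorem 7. Consider any $(R, R_1, E)$ in $\tilde{\mathcal{R}}^{OH}$. It must satisfy

\[ \begin{aligned} R &\ge \min_{0 \le r_1 \le R_1} \max \left\{ \frac{1}{2} \log^+ \left[ \frac{1}{D} \left( \frac{1}{\sigma_X^2} + \frac{1 - 2^{-2 r_1}}{\sigma_{N_1}^2} \right)^{-1} \right], \; \frac{1}{2} \log^+ \left[ \frac{\sigma_X^2}{D} \right] + r_1 - R_1 \right\} \\ &= \frac{1}{2} \log^+ \left[ \frac{\sigma_X^2}{D} \left( 1 - \rho^2 + \rho^2 2^{-2 R_1} \right) \right], \end{aligned} \]

where the equality is achieved by

\[ r_1 = R_1 + \frac{1}{2} \log \left( 1 - \rho^2 + \rho^2 2^{-2 R_1} \right). \tag{28} \]

We therefore have that $(R, R_1, E)$ is in $\bar{\mathcal{R}}^{OH}$, and hence $\tilde{\mathcal{R}}^{OH} \subseteq \bar{\mathcal{R}}^{OH}$. The proof of the reverse containment follows by noticing that for any $(R, R_1, E)$ in $\bar{\mathcal{R}}^{OH}$, there exists $r_1$ as in (28) such that all inequalities in the definition of $\tilde{\mathcal{R}}^{OH}$ are satisfied. □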
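The balancing argument behind (28) can be checked numerically. The sketch below is an illustration under stated assumptions: logarithms are base 2, $\rho^2$ is taken to be $\sigma_X^2 / (\sigma_X^2 + \sigma_{N_1}^2)$, and all function names are hypothetical. It compares a grid search over $r_1$ with the closed-form expression obtained by equating the two terms of the maximum.

```python
import math


def log_plus(value):
    """log^+ in bits: max(0, log2(value))."""
    return max(0.0, math.log2(value))


def balancing_r1(helper_rate, var_x, var_n1):
    """The value of r_1 from (28), with rho^2 = var_x / (var_x + var_n1)."""
    rho_sq = var_x / (var_x + var_n1)
    return helper_rate + 0.5 * math.log2(
        1.0 - rho_sq + rho_sq * 2.0 ** (-2.0 * helper_rate))


def closed_form_main_rate(helper_rate, distortion, var_x, var_n1):
    """Closed-form lower bound on the main-encoder rate R."""
    rho_sq = var_x / (var_x + var_n1)
    factor = 1.0 - rho_sq + rho_sq * 2.0 ** (-2.0 * helper_rate)
    return 0.5 * log_plus(var_x / distortion * factor)


def grid_search_main_rate(helper_rate, distortion, var_x, var_n1, steps=2000):
    """Minimise the maximum of the two rate bounds over r_1 in [0, R_1]."""
    best = math.inf
    for k in range(steps + 1):
        r1 = helper_rate * k / steps
        precision = 1.0 / var_x + (1.0 - 2.0 ** (-2.0 * r1)) / var_n1
        first = 0.5 * log_plus(1.0 / (distortion * precision))
        second = 0.5 * log_plus(var_x / distortion) + r1 - helper_rate
        best = min(best, max(first, second))
    return best
```

For example, with `helper_rate=1.0`, `distortion=0.3`, `var_x=1.0`, and `var_n1=0.5`, the balancing value is $r_1 = 0.5$ and the two computations agree up to the grid resolution.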
8 A General Outer Bound

We return to the general problem formulated in Section 3. The problem remains open to date. Several inner bounds are
known for L = 1 101 15] |6l 13 • But even for L = 1, there is no nontrivial outer bound with which to compare the inner bounds. 
We give an outer bound for a class of instances of the general problem. 

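Consider the class of instances such that $P_{\mathbf{X}} = Q_{\mathbf{X}}$, i.e., the marginal distributions of $\mathbf{X}$ are the same under both hypotheses. Stein's lemma [34] asserts that the centralized type 2 error exponent for this class of problems is

\[ E_C = D\left( P_{\mathbf{X}Y} \| Q_{\mathbf{X}Y} \right), \]

which is achieved when $\mathbf{X}$ and $Y$ both are available at the detector. Let

\[ \mathcal{R}_C \triangleq \{ (\mathbf{R}, E) : E \le E_C \}. \]

We have the following trivial centralized outer bound.

Lemma 5. $\mathcal{R} \subseteq \mathcal{R}_C$.

Let $\mathcal{S}$ be the set of random variables $Z$ such that there exist two joint distributions $P_{\mathbf{X}YZ}$ and $Q_{\mathbf{X}YZ}$ satisfying

(C12) $\sum_z P_{\mathbf{X}YZ} = P_{\mathbf{X}Y}$, the distribution under $H_0$,

(C13) $\sum_z Q_{\mathbf{X}YZ} = Q_{\mathbf{X}Y}$, the distribution under $H_1$,

(C14) $Q_{\mathbf{X}YZ} = Q_{\mathbf{X}|Z} Q_{Y|Z} Q_Z$, i.e., $\mathbf{X}$ and $Y$ are conditionally independent given $Z$ under the $Q$ distribution, and

(C15) $P_{\mathbf{X}Z} = Q_{\mathbf{X}Z}$, i.e., the joint distributions of $(\mathbf{X}, Z)$ are the same under both distributions.

Note that the joint distributions of $(Y, Z)$ need not be the same under the two distributions. If $P_{\mathbf{X}YZ}$ and $Q_{\mathbf{X}YZ}$ are the joint distributions of $\mathbf{X}$, $Y$, and $Z$ under $H_0$ and $H_1$, respectively, and $Z$ is available to the detector, then the problem can be related to the $L$-encoder hypothesis testing against conditional independence. Now $Z$ is not present in the original problem, but we can augment the sample space by introducing $Z$ and supplying it to the decoder. The outer bound for this new problem is then an outer bound for the original problem. Moreover, we can then optimize over $Z$ to obtain the best possible bound.

Let $\chi$ and $\Lambda_o$ be defined as in Section 4.2 with $X_{L+1}$ restricted to be deterministic. If $\mathcal{S}$ is nonempty, then for any $(Z, X, \lambda_o)$ in $\mathcal{S} \times \chi \times \Lambda_o$, define the set

\[ \mathcal{R}_o(Z, X, \lambda_o) \triangleq \Big\{ (\mathbf{R}, E) : \sum_{l \in S} R_l \ge I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, Z, T) + \sum_{l \in S} I(X_l; U_l | X, W, Z, T) \text{ for all } S \subseteq \mathcal{L}, \text{ and } \]
\[ E \le I(Y; \mathbf{U} | Z, T) + D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \Big\}. \]

Finally, let

\[ \mathcal{R}_o \triangleq \bigcap_{Z \in \mathcal{S}} \bigcap_{X \in \chi} \bigcup_{\lambda_o \in \Lambda_o} \mathcal{R}_o(Z, X, \lambda_o). \]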
We have the following outer bound to the rate-exponent region of this class of problems.

Theorem 8. $\mathcal{R} \subseteq \mathcal{R}_o \cap \mathcal{R}_C$.

Proof. In light of Proposition 1 and Lemma 5, it suffices to show that

\[ \mathcal{R}_* \subseteq \mathcal{R}_o. \]
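Consider $(\mathbf{R}, E)$ in $\mathcal{R}_*$. Then there exists a block length $n$ and encoders $f_l^{(n)}$ such that

\[ R_l \ge \frac{1}{n} \log \left| f_l^{(n)}(X_l^n) \right| \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \tag{29} \]

\[ E \le \frac{1}{n} D\left( P_{M Y^n} \| Q_{M Y^n} \right), \tag{30} \]

where for brevity $M$ denotes the tuple of messages $(f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}$. Consider any $Z$ in $\mathcal{S}$. Then

\[ \begin{aligned} D\left( P_{M Y^n} \| Q_{M Y^n} \right) &\le D\left( P_{M Y^n Z^n} \| Q_{M Y^n Z^n} \right) \\ &= D\left( P_{M Z^n} \| Q_{M Z^n} \right) + D\left( P_{Y^n | M Z^n} \| Q_{Y^n | M Z^n} | M, Z^n \right) \\ &= D\left( P_{Y^n | M Z^n} \| Q_{Y^n | Z^n} | M, Z^n \right) \\ &= I\left( Y^n; M | Z^n \right) + D\left( P_{Y^n | Z^n} \| Q_{Y^n | Z^n} | Z^n \right) \\ &= I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} | Z^n \right) + n D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right), \end{aligned} \]

where the third line uses (C14) and (C15). This together with (30) implies

\[ E - D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \le \frac{1}{n} I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} | Z^n \right). \tag{31} \]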



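It now follows from (29), (31), and Corollary 1 that $\left( \mathbf{R}, \left( E - D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \right)^+ \right)$ is in $\mathcal{R}^{CI}$. Therefore from Theorem 2, it must also be in $\mathcal{R}_o^{CI}$. Hence for any $X$ in $\chi$, there exists $\lambda_o$ in $\Lambda_o$ such that $\left( \mathbf{R}, \left( E - D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \right)^+ \right)$ is in $\mathcal{R}_o^{CI}(X, \lambda_o)$, i.e.,

\[ \sum_{l \in S} R_l \ge I(X; \mathbf{U}_S | \mathbf{U}_{S^c}, Z, T) + \sum_{l \in S} I(X_l; U_l | X, W, Z, T) \text{ for all } S \subseteq \mathcal{L}, \text{ and} \]

\[ \left( E - D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \right)^+ \le I(Y; \mathbf{U} | Z, T). \]

This means that $(\mathbf{R}, E)$ is in $\mathcal{R}_o(Z, X, \lambda_o)$, and hence $\mathcal{R}_* \subseteq \mathcal{R}_o$. □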



Although the outer bound above is not computable in general, it simplifies to the following computable form for the special case in which $L = 1$. Let

\[ \tilde{\mathcal{R}}_o \triangleq \bigcap_{Z \in \mathcal{S}} \Big\{ (R_1, E) : \text{there exists } U_1 \text{ such that } \]
\[ R_1 \ge I(X_1; U_1 | Z), \]
\[ E \le I(Y; U_1 | Z) + D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right), \]
\[ |\mathcal{U}_1| \le |\mathcal{X}_1| + 1, \text{ and} \]
\[ U_1 \leftrightarrow X_1 \leftrightarrow (Y, Z) \Big\}. \]

Corollary 5. For 1-encoder general hypothesis testing, $\mathcal{R}_o = \tilde{\mathcal{R}}_o$ and hence $\mathcal{R} \subseteq \tilde{\mathcal{R}}_o \cap \mathcal{R}_C$.

Proof. It suffices to show that $\mathcal{R}_o = \tilde{\mathcal{R}}_o$. This immediately follows by noticing that given any $Z$ in $\mathcal{S}$, the outer bound can be related to the rate-exponent region of the 1-encoder hypothesis testing against conditional independence problem. The result then follows from Theorem 3. □

It is easy to see that the outer bound is tight for the test against independence.
Corollary 6. (Test against independence, [5]) If $Q_{X_1 Y} = P_{X_1} P_Y$, then

\[ \mathcal{R} = \tilde{\mathcal{R}}_o. \]

Proof. This follows by choosing $Z$ to be deterministic in the outer bound and then invoking the result of Ahlswede and Csiszár [5]. □

Remark 2: The outer bound is not always better than the centralized outer bound. In particular, if

\[ D\left( P_{Y|Z} \| Q_{Y|Z} | Z \right) \ge E_C \]

for all $Z$ in $\mathcal{S}$, then the outer bound is no better than the centralized outer bound.

8.1 Gaussian Case

To illustrate this bound, let us consider a Gaussian example in which $X_1$ and $Y$ are zero-mean unit-variance jointly Gaussian sources with the correlation coefficients $\rho_0$ and $\rho_1$ under $H_0$ and $H_1$, respectively, where $\rho_0 \ne \rho_1$, $\rho_0^2 < 1$, and $\rho_1^2 < 1$. We can assume without loss of generality that $0 \le \rho_1 < 1$ because the case $-1 < \rho_1 < 0$ can be handled by multiplying $Y$ by $-1$. We use lowercase $p$ and $q$ to denote appropriate Gaussian densities under hypotheses $H_0$ and $H_1$, respectively. Let $\mathcal{R}^G$ be the rate-exponent region of this problem. We focus on the following three regions (Fig. 7) for which the outer bound is nontrivial.



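\[ \mathcal{D}_1 = \{ (\rho_0, \rho_1) : 0 \le \rho_1 < \rho_0 < 1 \}, \]

\[ \mathcal{D}_2 = \{ (\rho_0, \rho_1) : 0 \le \rho_1 \text{ and } 2\rho_1 - 1 \le \rho_0 < \rho_1 \}, \]

\[ \mathcal{D}_3 = \{ (\rho_0, \rho_1) : -1 < \rho_0 < 2\rho_1 - 1 \text{ and } \ldots \}. \]

Figure 7: Regions of pair $(\rho_0, \rho_1)$ for which the outer bound is nontrivial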



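8.1.1 Outer Bound

Let us define

\[ \rho \triangleq \begin{cases} \dfrac{\rho_0 - \rho_1}{1 - \rho_1} & \text{if } (\rho_0, \rho_1) \text{ is in } \mathcal{D}_1 \cup \mathcal{D}_2 \\[2mm] \dfrac{\rho_0 + \rho_1}{1 - \rho_1} & \text{if } (\rho_0, \rho_1) \text{ is in } \mathcal{D}_3 \end{cases} \]

and

\[ c \triangleq \begin{cases} 0 & \text{if } (\rho_0, \rho_1) \text{ is in } \mathcal{D}_1 \cup \mathcal{D}_2 \\[2mm] \dfrac{2 (\log e) \rho_1}{1 - \rho_1} & \text{if } (\rho_0, \rho_1) \text{ is in } \mathcal{D}_3. \end{cases} \]

The centralized type 2 error exponent is

\[ E_C^G = \frac{1}{2} \log \left( \frac{1 - \rho_1^2}{1 - \rho_0^2} \right) + \frac{(\log e) \rho_1 (\rho_1 - \rho_0)}{1 - \rho_1^2}. \]

Define the sets

\[ \mathcal{R}_o^G \triangleq \Big\{ (R_1, E) : E \le \frac{1}{2} \log \left( \frac{1}{1 - \rho^2 + \rho^2 2^{-2 R_1}} \right) + c \Big\} \]

and

\[ \mathcal{R}_C^G \triangleq \{ (R_1, E) : E \le E_C^G \}. \]

We have the following outer bound.

Theorem 9. If $(\rho_0, \rho_1)$ is in $\mathcal{D}_1 \cup \mathcal{D}_2 \cup \mathcal{D}_3$, then

\[ \mathcal{R}^G \subseteq \mathcal{R}_o^G \cap \mathcal{R}_C^G. \]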



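Proof. The proof is in two steps: obtain a single letter outer bound similar to the one in Corollary 5 and then use it to obtain the desired outer bound. Consider $(\rho_0, \rho_1)$ in $\mathcal{D}_1$. Let $Z$, $Z'$, $W$, and $V$ be standard normal random variables independent of each other. $X_1$ and $Y$ can be expressed as

\[ X_1 = \sqrt{\rho_1} Z + \sqrt{\rho_0 - \rho_1} Z' + \sqrt{1 - \rho_0} W \]
\[ Y = \sqrt{\rho_1} Z + \sqrt{\rho_0 - \rho_1} Z' + \sqrt{1 - \rho_0} V \]

under $H_0$ and as

\[ X_1 = \sqrt{\rho_1} Z + \sqrt{1 - \rho_1} W \]
\[ Y = \sqrt{\rho_1} Z + \sqrt{1 - \rho_1} V \]

under $H_1$. It is easy to verify that conditions (C12) through (C15) are satisfied if we replace the distributions by the corresponding Gaussian densities. Therefore, $Z$ is in $\mathcal{S}$. Define the set

\[ \tilde{\mathcal{R}}_o^G \triangleq \Big\{ (R_1, E) : \text{there exists } U_1 \text{ such that } \]
\[ R_1 \ge I(X_1; U_1 | Z), \]
\[ E \le I(Y; U_1 | Z) + D\left( p_{Y|Z} \| q_{Y|Z} | Z \right), \text{ and} \]
\[ (Y, Z) \leftrightarrow X_1 \leftrightarrow U_1 \Big\}. \]

Corollary 7. $\mathcal{R}^G \subseteq \tilde{\mathcal{R}}_o^G \cap \mathcal{R}_C^G$.

The proof is immediate as a continuous extension of Corollary 5. From Corollary 7, it suffices to show that

\[ \tilde{\mathcal{R}}_o^G \subseteq \mathcal{R}_o^G. \]

Note first that

\[ D\left( p_{Y|Z} \| q_{Y|Z} | Z \right) = 0 \]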
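here because the joint densities of $(Y, Z)$ are the same under both hypotheses. Consider any $(R_1, E)$ in $\tilde{\mathcal{R}}_o^G$. Then there exists a random variable $U_1$ such that $(Y, Z) \leftrightarrow X_1 \leftrightarrow U_1$,

\[ R_1 \ge I(X_1; U_1 | Z), \text{ and} \tag{32} \]
\[ E \le I(Y; U_1 | Z). \tag{33} \]

Since $X_1$, $Y$, and $Z$ are jointly Gaussian under $H_0$, we can write that

\[ Y = \rho X_1 + \sqrt{\rho_1} (1 - \rho) Z + B, \]

where $B$ is a zero-mean Gaussian random variable with the variance $(1 - \rho_1)(1 - \rho^2)$ and is independent of $X_1$ and $Z$. We now have

\[ \begin{aligned} h(Y | U_1, Z) &= h\left( \rho X_1 + \sqrt{\rho_1} (1 - \rho) Z + B \,|\, U_1, Z \right) \\ &= h\left( \rho X_1 + B \,|\, U_1, Z \right) \\ &\ge \frac{1}{2} \log \left( 2^{2 h(\rho X_1 | U_1, Z)} + 2^{2 h(B)} \right) \qquad (34) \\ &= \frac{1}{2} \log \left( \rho^2 2^{2 h(X_1 | U_1, Z)} + 2^{2 h(B)} \right) \\ &= \frac{1}{2} \log \left( \rho^2 2^{2 (h(X_1 | Z) - I(X_1; U_1 | Z))} + 2^{2 h(B)} \right) \\ &= \frac{1}{2} \log \left( \rho^2 (1 - \rho_1) 2^{-2 I(X_1; U_1 | Z)} + (1 - \rho_1)(1 - \rho^2) \right) + \frac{1}{2} \log (2 \pi e) \\ &\ge \frac{1}{2} \log \left( \rho^2 (1 - \rho_1) 2^{-2 R_1} + (1 - \rho_1)(1 - \rho^2) \right) + \frac{1}{2} \log (2 \pi e), \qquad (35) \end{aligned} \]

where

(34) follows from the entropy power inequality [34] because $X_1$ and $B$ are independent given $(U_1, Z)$, and

(35) follows because the function

\[ f(x) = \frac{1}{2} \log \left( p 2^{-2x} + q \right) \]

is monotonically decreasing in $x$ for $p > 0$, and we have the rate constraint in (32).

Now (33) and (35) imply

\[ \begin{aligned} E &\le h(Y | Z) - h(Y | U_1, Z) \\ &\le \frac{1}{2} \log \left( \frac{1 - \rho_1}{\rho^2 (1 - \rho_1) 2^{-2 R_1} + (1 - \rho_1)(1 - \rho^2)} \right) \\ &= \frac{1}{2} \log \left( \frac{1}{1 - \rho^2 + \rho^2 2^{-2 R_1}} \right), \end{aligned} \]

which proves that $(R_1, E)$ is in $\mathcal{R}_o^G$. This completes the proof for the region $\mathcal{D}_1$.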



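The proof is analogous for $(\rho_0, \rho_1)$ in the region $\mathcal{D}_2$. The only difference is that under $H_0$, $X_1$ and $Y$ can now be expressed as

\[ X_1 = \sqrt{\rho_1} Z + \sqrt{\rho_1 - \rho_0} Z' + \sqrt{1 - 2\rho_1 + \rho_0} W \]
\[ Y = \sqrt{\rho_1} Z - \sqrt{\rho_1 - \rho_0} Z' + \sqrt{1 - 2\rho_1 + \rho_0} V. \]

Suppose now that $(\rho_0, \rho_1)$ is in $\mathcal{D}_3$. One can verify that $-\rho_0 - \rho_1 \ge 0$ here. Hence, $X_1$ and $Y$ can be expressed as

\[ X_1 = \sqrt{\rho_1} Z + \sqrt{-\rho_0 - \rho_1} Z' + \sqrt{1 + \rho_0} W \]
\[ Y = -\sqrt{\rho_1} Z - \sqrt{-\rho_0 - \rho_1} Z' + \sqrt{1 + \rho_0} V \]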
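under $H_0$. Their expressions under $H_1$ are the same as before. It is evident that $Z$ is in $\mathcal{S}$. Therefore, the outer bound in Corollary 7 is valid for this case, which implies that it suffices to show that

\[ \tilde{\mathcal{R}}_o^G \subseteq \mathcal{R}_o^G. \]

Under $H_0$, the conditional distribution of $Y$ given $Z = z$ is Gaussian with the mean $-\sqrt{\rho_1} z$ and the variance $1 - \rho_1$. Similarly under $H_1$, it is Gaussian with the mean $\sqrt{\rho_1} z$ and the variance $1 - \rho_1$. We therefore obtain

\[ \begin{aligned} D\left( p_{Y|Z} \| q_{Y|Z} | Z \right) &= \int_{z \in \mathbb{R}} p_Z(z) \, dz \int_{y \in \mathbb{R}} p_{Y|Z}(y|z) \log \frac{p_{Y|Z}(y|z)}{q_{Y|Z}(y|z)} \, dy \\ &= \int_{z \in \mathbb{R}} p_Z(z) \, dz \int_{y \in \mathbb{R}} p_{Y|Z}(y|z) \log \frac{\exp\left( -\frac{(y + \sqrt{\rho_1} z)^2}{2 (1 - \rho_1)} \right)}{\exp\left( -\frac{(y - \sqrt{\rho_1} z)^2}{2 (1 - \rho_1)} \right)} \, dy \\ &= \int_{z \in \mathbb{R}} p_Z(z) \, dz \int_{y \in \mathbb{R}} p_{Y|Z}(y|z) \left( -\frac{2 (\log e) \sqrt{\rho_1} \, y z}{1 - \rho_1} \right) dy \\ &= -\frac{2 (\log e) \sqrt{\rho_1}}{1 - \rho_1} \int_{z \in \mathbb{R}} z \, p_Z(z) \, dz \int_{y \in \mathbb{R}} y \, p_{Y|Z}(y|z) \, dy \\ &= -\frac{2 (\log e) \sqrt{\rho_1}}{1 - \rho_1} \int_{z \in \mathbb{R}} z \, p_Z(z) \left( -\sqrt{\rho_1} z \right) dz \\ &= \frac{2 (\log e) \rho_1}{1 - \rho_1} \int_{z \in \mathbb{R}} z^2 p_Z(z) \, dz \\ &= \frac{2 (\log e) \rho_1}{1 - \rho_1}. \end{aligned} \]

Again, since $X_1$, $Y$, and $Z$ are jointly Gaussian under $H_0$, we can write

\[ Y = \rho X_1 - \sqrt{\rho_1} (1 + \rho) Z + B, \]

where $B$ is defined as before. The rest of the proof is identical to the region $\mathcal{D}_1$ case. □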
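The outer bound of Theorem 9 is straightforward to evaluate. The sketch below is a hedged illustration, not the authors' code: it assumes logarithms to base 2, takes $\rho$ to be the regression coefficient of $Y$ on $X_1$ given $Z$ implied by the decompositions in the proof, and takes the additive constant to be the conditional divergence computed above. The function names and the boolean flag `in_d3` (which the caller must set according to the region containing $(\rho_0, \rho_1)$) are hypothetical.

```python
import math

LOG2_E = math.log2(math.e)


def centralized_exponent(rho0, rho1):
    """Divergence between the two zero-mean unit-variance bivariate Gaussians."""
    return (0.5 * math.log2((1.0 - rho1 ** 2) / (1.0 - rho0 ** 2))
            + LOG2_E * rho1 * (rho1 - rho0) / (1.0 - rho1 ** 2))


def outer_bound_exponent(rate, rho0, rho1, in_d3=False):
    """Largest exponent allowed by the single-letter and centralized bounds."""
    if in_d3:
        # Assumed form for region D3: Y = rho*X1 - sqrt(rho1)*(1 + rho)*Z + B.
        rho = (rho0 + rho1) / (1.0 - rho1)
        offset = 2.0 * LOG2_E * rho1 / (1.0 - rho1)
    else:
        # Assumed form for regions D1 and D2: Y = rho*X1 + sqrt(rho1)*(1 - rho)*Z + B.
        rho = (rho0 - rho1) / (1.0 - rho1)
        offset = 0.0
    single_letter = 0.5 * math.log2(
        1.0 / (1.0 - rho ** 2 + rho ** 2 * 2.0 ** (-2.0 * rate))) + offset
    return min(single_letter, centralized_exponent(rho0, rho1))
```

At zero rate in regions $\mathcal{D}_1$ and $\mathcal{D}_2$ the bound evaluates to zero, and as the rate grows it saturates at the smaller of the single-letter limit and the centralized exponent.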



8.1.2 Ahlswede and Csiszar's Inner Bound 

We next compare the outer bound with Ahlswede and Csiszar's inner bound, which is obtained by using a Gaussian test 
channel to quantize Xi. One can use better inner bounds El!)^ but they are quite complicated and for the Gaussian case 
considered here, Ahlswede and Csiszar's bound itself is quite close to our outer bound in some cases. Let 



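\[ \mathcal{R}_i^G \triangleq \Big\{ (R_1, E) : E \le \frac{1}{2} \log \left( \frac{1 - \rho_1^2 (1 - 2^{-2 R_1})}{1 - \rho_0^2 (1 - 2^{-2 R_1})} \right) - \frac{(\log e) \rho_1 (\rho_0 - \rho_1)(1 - 2^{-2 R_1})}{1 - \rho_1^2 (1 - 2^{-2 R_1})} \Big\}. \]

Proposition 2. [5] $\mathcal{R}_i^G \subseteq \mathcal{R}^G$.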

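Proof. Fix any $(R_1, E)$ in $\mathcal{R}_i^G$. Let $U_1 = X_1 + P$, where $P$ is a zero-mean Gaussian random variable independent of $(X_1, Y)$ such that

\[ I(X_1; U_1) = R_1, \]

which implies that the variance of $P$ is

\[ \sigma_P^2 = \frac{1}{2^{2 R_1} - 1}. \]

The covariance matrix of $(U_1, Y)$ is

\[ K_0 = \begin{bmatrix} 1 + \sigma_P^2 & \rho_0 \\ \rho_0 & 1 \end{bmatrix} \]

under $H_0$ and is

\[ K_1 = \begin{bmatrix} 1 + \sigma_P^2 & \rho_1 \\ \rho_1 & 1 \end{bmatrix} \]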
Pi 1 
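under $H_1$. It now follows from Ahlswede and Csiszár's scheme [5, Theorem 5] that the achievable exponent is

\[ \begin{aligned} E_{AC} &= D\left( p_{U_1 Y} \| q_{U_1 Y} \right) \\ &= \int_{z \in \mathbb{R}^2} p_{U_1 Y}(z) \log \frac{p_{U_1 Y}(z)}{q_{U_1 Y}(z)} \, dz \\ &= -\frac{1}{2} \log \left( (2 \pi e)^2 \det(K_0) \right) - \int_{z \in \mathbb{R}^2} p_{U_1 Y}(z) \log q_{U_1 Y}(z) \, dz \\ &= -\frac{1}{2} \log \left( (2 \pi e)^2 \det(K_0) \right) - \int_{z \in \mathbb{R}^2} p_{U_1 Y}(z) \left[ -\frac{\log e}{2} z^T K_1^{-1} z - \frac{1}{2} \log \left( (2 \pi)^2 \det(K_1) \right) \right] dz \\ &= \frac{1}{2} \log \frac{\det(K_1)}{\det(K_0)} - \log e + \frac{\log e}{2} \int_{z \in \mathbb{R}^2} p_{U_1 Y}(z) \left( z^T K_1^{-1} z \right) dz \\ &= \frac{1}{2} \log \frac{\det(K_1)}{\det(K_0)} - \log e + \frac{(\log e)(1 + \sigma_P^2 - \rho_0 \rho_1)}{\det(K_1)} \\ &= \frac{1}{2} \log \left( \frac{1 + \sigma_P^2 - \rho_1^2}{1 + \sigma_P^2 - \rho_0^2} \right) - \log e + \frac{(\log e)(1 + \sigma_P^2 - \rho_0 \rho_1)}{1 + \sigma_P^2 - \rho_1^2} \\ &= \frac{1}{2} \log \left( \frac{1 - \rho_1^2 (1 - 2^{-2 R_1})}{1 - \rho_0^2 (1 - 2^{-2 R_1})} \right) - \frac{(\log e) \rho_1 (\rho_0 - \rho_1)(1 - 2^{-2 R_1})}{1 - \rho_1^2 (1 - 2^{-2 R_1})}. \end{aligned} \]

This proves that $(R_1, E)$ is in $\mathcal{R}^G$. □

The inner and outer bounds coincide for the test against independence.

Corollary 8. (Test against independence, [5], [35]) If $X_1$ and $Y$ are independent under $H_1$, i.e., $\rho_1 = 0$, then

\[ \mathcal{R}^G = \mathcal{R}_i^G = \mathcal{R}_o^G. \]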
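The closed form for the inner bound can be cross-checked against a direct evaluation of the Gaussian divergence $D(p_{U_1 Y} \| q_{U_1 Y})$ from the two covariance matrices. This is a minimal sketch assuming logarithms to base 2 and a strictly positive rate $R_1$; the function names are hypothetical.

```python
import math

LOG2_E = math.log2(math.e)


def inner_bound_closed_form(rate, rho0, rho1):
    """Closed-form exponent in bits, with t = 1 - 2^(-2 R_1)."""
    t = 1.0 - 2.0 ** (-2.0 * rate)
    return (0.5 * math.log2((1.0 - rho1 ** 2 * t) / (1.0 - rho0 ** 2 * t))
            - LOG2_E * rho1 * (rho0 - rho1) * t / (1.0 - rho1 ** 2 * t))


def gaussian_divergence_bits(cov0, cov1):
    """D(N(0, cov0) || N(0, cov1)) in bits for symmetric 2x2 covariances."""
    (a0, b0), (_, d0) = cov0
    (a1, b1), (_, d1) = cov1
    det0 = a0 * d0 - b0 * b0
    det1 = a1 * d1 - b1 * b1
    # trace(cov1^{-1} cov0) written out for the 2x2 case
    trace = (d1 * a0 - 2.0 * b1 * b0 + a1 * d0) / det1
    return 0.5 * math.log2(det1 / det0) + 0.5 * LOG2_E * (trace - 2.0)


def inner_bound_direct(rate, rho0, rho1):
    """Exponent obtained from the test channel U_1 = X_1 + P (rate > 0)."""
    var_p = 1.0 / (2.0 ** (2.0 * rate) - 1.0)
    cov0 = ((1.0 + var_p, rho0), (rho0, 1.0))
    cov1 = ((1.0 + var_p, rho1), (rho1, 1.0))
    return gaussian_divergence_bits(cov0, cov1)
```

For instance, at $R_1 = 1$ with $(\rho_0, \rho_1) = (0.8, 0.5)$ both computations give approximately 0.122 bits.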



8.1.3 Numerical Results

Figure 8: Outer and inner bounds for four examples

Fig. 8 shows the inner and outer bounds for four examples. Fig. 8(a)-(c) are the examples when $(\rho_0, \rho_1)$ is in $\mathcal{D}_1 \cup \mathcal{D}_2$. Observe that the two bounds are quite close near zero and at all large rates. Fig. 8(d) is an example when $(\rho_0, \rho_1)$ is in $\mathcal{D}_3$. For this example, there is a gap between the inner and outer bounds at zero rate. This is due to the fact that in our outer bound, the joint densities of $(Y, Z)$ are different under the two hypotheses. Numerical results suggest that for a fixed $\rho_0$, the maximum gap between the inner and outer bounds decreases as we decrease $\rho_1$ and finally becomes zero at $\rho_1 = 0$, which is the test against independence.
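A comparison of this kind can be reproduced with a short script. The sketch below is only an illustration under stated assumptions: logarithms are base 2, the pair $(\rho_0, \rho_1)$ is taken from $\mathcal{D}_1 \cup \mathcal{D}_2$ so that the outer bound has no additive constant, $\rho = (\rho_0 - \rho_1)/(1 - \rho_1)$ is assumed, and the rate grid and all names are arbitrary choices rather than the settings used for Fig. 8.

```python
import math

LOG2_E = math.log2(math.e)


def inner_exponent(rate, rho0, rho1):
    """Inner bound achieved with a Gaussian test channel, in bits."""
    t = 1.0 - 2.0 ** (-2.0 * rate)
    return (0.5 * math.log2((1.0 - rho1 ** 2 * t) / (1.0 - rho0 ** 2 * t))
            - LOG2_E * rho1 * (rho0 - rho1) * t / (1.0 - rho1 ** 2 * t))


def outer_exponent(rate, rho0, rho1):
    """Outer bound for (rho0, rho1) in D1 or D2, capped by the centralized exponent."""
    rho = (rho0 - rho1) / (1.0 - rho1)
    single_letter = 0.5 * math.log2(
        1.0 / (1.0 - rho ** 2 + rho ** 2 * 2.0 ** (-2.0 * rate)))
    centralized = (0.5 * math.log2((1.0 - rho1 ** 2) / (1.0 - rho0 ** 2))
                   + LOG2_E * rho1 * (rho1 - rho0) / (1.0 - rho1 ** 2))
    return min(single_letter, centralized)


def gaps(rho0, rho1, num_points=100, step=0.05):
    """Outer minus inner exponent on a grid of rates step, 2*step, ..."""
    rates = [step * k for k in range(1, num_points + 1)]
    return [outer_exponent(r, rho0, rho1) - inner_exponent(r, rho0, rho1)
            for r in rates]
```

Calling `max(gaps(0.8, rho1))` for several values of `rho1` gives the maximum gap on the grid; at `rho1 = 0` the two bounds coincide.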

Remark 3: The outer bound can be extended to the vector Gaussian case. One can obtain a single letter outer bound similar to the one in Corollary 7. Then the outer bound can be optimized over all choices of $U_1$ by using an invertible transformation [36, 37] and the scalar solution obtained above. It follows from our earlier work that the outer bound is tight for the test against independence [38].
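Appendix A: Proof of Lemma 1

The proof is rather well known and appears in source coding literature quite often. For instance, a similar proof can be found in [16]. Let us define

\[ \bar{\Lambda}_i \triangleq \Big\{ \lambda_i = (\mathbf{U}, T) \in \Lambda_i : |\mathcal{U}_l| \le |\mathcal{X}_l| + 2^L - 1 \text{ for all } l \in \mathcal{L}, \text{ and } |\mathcal{T}| \le 2^L \Big\}, \]

and

\[ \bar{\mathcal{R}}_i^{CI} \triangleq \bigcup_{\lambda_i \in \bar{\Lambda}_i} \mathcal{R}_i^{CI}(\lambda_i). \]

We want to show that $\mathcal{R}_i^{CI} = \bar{\mathcal{R}}_i^{CI}$. We start with the deterministic $T$ case. Consider $\lambda_i = (\mathbf{U}, T)$ in $\Lambda_i$, where $T$ is deterministic. For any $S \subseteq \mathcal{L}$ containing 1, we have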

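\[ I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z) = H(\mathbf{X}_S | \mathbf{U}_{S^c}, X_{L+1}, Z) - H(\mathbf{X}_S | \mathbf{U}_{1^c}, U_1, X_{L+1}, Z), \]

and for any nonempty $S$ not containing 1, we have

\[ I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z) = I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c \setminus \{1\}}, U_1, X_{L+1}, Z). \]

Moreover,

\[ I(Y; \mathbf{U}, X_{L+1} | Z) = H(Y | Z) - H(Y | \mathbf{U}_{1^c}, U_1, X_{L+1}, Z). \]

It follows from the support lemma [32, Lemma 3.4, pp. 310] that there exists $\bar{U}_1$ with alphabet $\bar{\mathcal{U}}_1 \subseteq \mathcal{U}_1$ such that

\[ |\bar{\mathcal{U}}_1| \le |\mathcal{X}_1| + 2^L - 1, \]

\[ \sum_{u_1 \in \bar{\mathcal{U}}_1} \Pr(X_1 = x_1 | \bar{U}_1 = u_1) \Pr(\bar{U}_1 = u_1) = \Pr(X_1 = x_1) \text{ for all } x_1 \text{ in } \mathcal{X}_1 \text{ but one}, \]

\[ H(\mathbf{X}_S | \mathbf{U}_{1^c}, \bar{U}_1, X_{L+1}, Z) = H(\mathbf{X}_S | \mathbf{U}_{1^c}, U_1, X_{L+1}, Z) \text{ for all } S \text{ containing 1}, \]

\[ I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c \setminus \{1\}}, \bar{U}_1, X_{L+1}, Z) = I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c \setminus \{1\}}, U_1, X_{L+1}, Z) \text{ for all nonempty } S \text{ not containing 1}, \]

and

\[ H(Y | \mathbf{U}_{1^c}, \bar{U}_1, X_{L+1}, Z) = H(Y | \mathbf{U}_{1^c}, U_1, X_{L+1}, Z). \]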
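Since

\[ U_1 \leftrightarrow X_1 \leftrightarrow (\mathbf{U}_{1^c}, \mathbf{X}_{1^c}, X_{L+1}, Y, Z), \]

if we replace $U_1$ by $\bar{U}_1$ then the resulting $\lambda_i$ is in $\Lambda_i$ and $\mathcal{R}_i^{CI}(\lambda_i)$ remains unchanged. By repeating this procedure for $U_2, \ldots, U_L$, we conclude that there exists $\bar{\lambda}_i = (\bar{\mathbf{U}}, T)$ in $\bar{\Lambda}_i$ such that $T$ is deterministic and $\mathcal{R}_i^{CI}(\bar{\lambda}_i) = \mathcal{R}_i^{CI}(\lambda_i)$.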

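We now turn to general $T$. Consider $\lambda_i = (\mathbf{U}, T)$ in $\Lambda_i$. Let $(\mathbf{U}, t)$ denote the joint distribution of $(\mathbf{U}, T)$ conditioned on $\{T = t\}$. It follows from the deterministic $T$ case that for each $t$ in $\mathcal{T}$, there exists $\bar{\mathbf{U}}$ such that $(\bar{\mathbf{U}}, t)$ is in $\bar{\Lambda}_i$ and $\mathcal{R}_i^{CI}(\bar{\mathbf{U}}, t) = \mathcal{R}_i^{CI}(\mathbf{U}, t)$. Hence, on replacing $\mathbf{U}$ by $\bar{\mathbf{U}}$ for each $t$ in $\mathcal{T}$, we obtain $(\bar{\mathbf{U}}, T)$ in $\Lambda_i$ such that $|\bar{\mathcal{U}}_l| \le |\mathcal{X}_l| + 2^L - 1$ for all $l$ in $\mathcal{L}$ and $\mathcal{R}_i^{CI}(\bar{\mathbf{U}}, T) = \mathcal{R}_i^{CI}(\mathbf{U}, T)$. Now $\mathcal{R}_i^{CI}(\bar{\mathbf{U}}, T)$ is the set of vectors $(\mathbf{R}, E)$ such that

\[ \sum_{l \in S} R_l \ge I(\mathbf{X}_S; \bar{\mathbf{U}}_S | \bar{\mathbf{U}}_{S^c}, X_{L+1}, Z, T) \text{ for all } S, \text{ and} \]

\[ E \le I(Y; \bar{\mathbf{U}}, X_{L+1} | Z, T). \]

It again follows from the support lemma that there exists $\bar{T}$ with alphabet $\bar{\mathcal{T}} \subseteq \mathcal{T}$ such that

\[ |\bar{\mathcal{T}}| \le 2^L, \]

\[ I(\mathbf{X}_S; \bar{\mathbf{U}}_S | \bar{\mathbf{U}}_{S^c}, X_{L+1}, Z, \bar{T}) = I(\mathbf{X}_S; \bar{\mathbf{U}}_S | \bar{\mathbf{U}}_{S^c}, X_{L+1}, Z, T), \text{ and} \]

\[ I(Y; \bar{\mathbf{U}}, X_{L+1} | Z, \bar{T}) = I(Y; \bar{\mathbf{U}}, X_{L+1} | Z, T). \]

We therefore have that $\bar{\lambda}_i = (\bar{\mathbf{U}}, \bar{T})$ is in $\bar{\Lambda}_i$ and $\mathcal{R}_i^{CI}(\bar{\lambda}_i) = \mathcal{R}_i^{CI}(\lambda_i)$. This proves $\mathcal{R}_i^{CI} \subseteq \bar{\mathcal{R}}_i^{CI}$, and hence $\mathcal{R}_i^{CI} = \bar{\mathcal{R}}_i^{CI}$ because the reverse containment trivially holds.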

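For part (b), it suffices to show that $\bar{\mathcal{R}}_i^{CI}$ is closed. Consider any sequence $(\mathbf{R}^{(n)}, E^{(n)})$ in $\bar{\mathcal{R}}_i^{CI}$ that converges to $(\mathbf{R}, E)$. Since conditional mutual information is a continuous function, $\bar{\Lambda}_i$ is a compact set. Hence, there exists a sequence $\lambda_i^{(n)} = (\mathbf{U}^{(n)}, T^{(n)})$ in $\bar{\Lambda}_i$ that converges to $\lambda_i = (\mathbf{U}, T)$ in $\bar{\Lambda}_i$ such that $(\mathbf{R}^{(n)}, E^{(n)})$ is in $\mathcal{R}_i^{CI}(\lambda_i^{(n)})$, i.e.,

\[ \sum_{l \in S} R_l^{(n)} \ge I\left( \mathbf{X}_S; \mathbf{U}_S^{(n)} | \mathbf{U}_{S^c}^{(n)}, X_{L+1}, Z, T^{(n)} \right) \text{ for all } S, \text{ and} \]

\[ E^{(n)} \le I\left( Y; \mathbf{U}^{(n)}, X_{L+1} | Z, T^{(n)} \right). \]

Again, by the continuity of conditional mutual information, this implies that

\[ \sum_{l \in S} R_l \ge I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z, T) \text{ for all } S, \text{ and} \]

\[ E \le I(Y; \mathbf{U}, X_{L+1} | Z, T). \]

We thus have that $(\mathbf{R}, E)$ is in $\bar{\mathcal{R}}_i^{CI}$.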

Appendix B: Proof of Theorem 1 

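We prove the deterministic $T$ case. The general case follows by time sharing. Consider any $\lambda_i = (\mathbf{U}, T)$ in $\Lambda_i$ with $T$ being deterministic. Consider $(\mathbf{R}, E)$ such that

\[ \sum_{l \in S} R_l \ge I(\mathbf{X}_S; \mathbf{U}_S | \mathbf{U}_{S^c}, X_{L+1}, Z) \text{ for all } S \subseteq \mathcal{L}, \text{ and} \tag{36} \]

\[ E \le I(Y; \mathbf{U}, X_{L+1} | Z). \tag{37} \]

It suffices to show that $(\mathbf{R}, E)$ belongs to the rate-exponent region $\mathcal{R}^{CI}$.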
Consider a sufficiently large block length $n$, $\epsilon > 0$, and $\mu > 0$. For each $l$ in $\mathcal{L}$, let $\tilde{R}_l = I(X_l; U_l) + \alpha$, where $\alpha > 0$. To construct the codebook of encoder $l$, we first generate $2^{n \tilde{R}_l}$ independent codewords $U_l^n$, each according to $\prod_{i=1}^{n} P_{U_l}(u_{li})$, and then distribute them uniformly into $2^{n (R_l + \epsilon)}$ bins. The codebooks and the bin assignments are revealed to the encoders and the detector. The encoding is done in two steps: quantization and binning. The encoder $l$ first quantizes $X_l^n$ by selecting a codeword $U_l^n$ that is jointly $\mu$-typical with it. We adopt the typicality notion of Han [6]. If there is more than one such codeword, then the encoder $l$ selects one of them arbitrarily. If there is no such codeword, it selects an arbitrary codeword. The encoder then sends to the detector the index of the bin to which the codeword $U_l^n$ belongs. In order to be consistent with our earlier notation, we denote this encoding function by $f_l^{(n)}$. It is clear that the rate constraints are satisfied, i.e.,

\[ \frac{1}{n} \log \left| f_l^{(n)}(X_l^n) \right| = R_l + \epsilon \text{ for all } l \text{ in } \mathcal{L}. \tag{38} \]
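The two-step encoder (quantize, then bin) can be illustrated with a toy implementation. The sketch below is deliberately simplified and is not the construction analysed in the proof: it uses a binary alphabet, replaces the joint $\mu$-typicality test with a Hamming-distance threshold, and all names and parameters (`quant_rate`, `bin_rate`, `max_fraction`) are hypothetical.

```python
import math
import random


def build_codebook(block_length, quant_rate, bin_rate, prob_one, rng):
    """Draw i.i.d. binary codewords and assign each to a uniformly random bin."""
    num_codewords = 2 ** math.ceil(block_length * quant_rate)
    num_bins = 2 ** math.ceil(block_length * bin_rate)
    codewords = [
        tuple(1 if rng.random() < prob_one else 0 for _ in range(block_length))
        for _ in range(num_codewords)
    ]
    bin_of = [rng.randrange(num_bins) for _ in range(num_codewords)]
    return codewords, bin_of, num_bins


def encode(source_block, codewords, bin_of, max_fraction):
    """Quantize to the first sufficiently close codeword, then send its bin index."""
    limit = max_fraction * len(source_block)
    for index, codeword in enumerate(codewords):
        mismatches = sum(a != b for a, b in zip(source_block, codeword))
        if mismatches <= limit:  # stand-in for the joint typicality test
            return bin_of[index]
    return bin_of[0]  # no match found: fall back to an arbitrary codeword


rng = random.Random(0)
codewords, bin_of, num_bins = build_codebook(8, 0.5, 0.25, 0.5, rng)
source_block = tuple(rng.randrange(2) for _ in range(8))
message = encode(source_block, codewords, bin_of, 0.25)
rate = math.log2(num_bins) / 8
```

Only the bin index is transmitted, so the rate is determined by the number of bins rather than by the number of codewords, which is what (38) expresses.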



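The next lemma is a standard achievability result in distributed source coding.

Lemma 6. For any $\delta > 0$, $\epsilon > 0$, $\mu > 0$, and all sufficiently large $n$, there exists a function

\[ \psi^{(n)} : \prod_{l=1}^{L} \left\{ 1, \ldots, 2^{n (R_l + \epsilon)} \right\} \times \mathcal{X}_{L+1}^n \times \mathcal{Z}^n \to \prod_{l=1}^{L} \mathcal{U}_l^n \]

such that (a) if

\[ V = \left\{ \mathbf{U}^n, X_{L+1}^n, Y^n, Z^n \text{ are jointly } \mu\text{-typical under } H_0 \right\}, \]

then $P(V) \ge 1 - \delta$; and (b)

\[ P_e \triangleq \Pr\left( \psi^{(n)}\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n, Z^n \right) \ne \mathbf{U}^n \right) \le \delta. \]

One can prove this lemma using standard random coding arguments. See [27, 28] for proofs of similar results. Applying this lemma to the hypothesis testing problem at hand, we have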



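\[ \frac{1}{n} I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n | Z^n \right) = H(Y|Z) + \frac{1}{n} H\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} | X_{L+1}^n, Z^n \right) - \frac{1}{n} H\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, Y^n | X_{L+1}^n, Z^n \right). \tag{39} \]

We can lower bound the second term in (39) as

\[ \begin{aligned} \frac{1}{n} H\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}} | X_{L+1}^n, Z^n \right) &= \frac{1}{n} H\left( \mathbf{U}^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} H\left( \mathbf{U}^n | (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n, Z^n \right) \\ &\ge \frac{1}{n} H\left( \mathbf{U}^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} H\left( \mathbf{U}^n | \psi^{(n)}\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n, Z^n \right) \right) \qquad (40) \\ &\ge \frac{1}{n} H\left( \mathbf{U}^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} H_b(P_e) - P_e \sum_{l=1}^{L} \log |\mathcal{U}_l| \qquad (41) \\ &\ge \frac{1}{n} H\left( \mathbf{U}^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l|, \qquad (42) \end{aligned} \]

where

(40) follows from the data processing inequality [34, Theorem 2.8.1],

(41) follows from Fano's inequality [34, Theorem 2.10.1], and

(42) follows from Lemma 6(b) and the fact that $H_b(P_e) \le 1$.

The third term in (39) can be upper bounded as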



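\[ \begin{aligned} \frac{1}{n} H\left( (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, Y^n | X_{L+1}^n, Z^n \right) &\le \frac{1}{n} H\left( \mathbf{U}^n, (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, Y^n | X_{L+1}^n, Z^n \right) \\ &= \frac{1}{n} H\left( \mathbf{U}^n, Y^n | X_{L+1}^n, Z^n \right). \end{aligned} \tag{43} \]

On applying bounds (42) and (43) into (39), we obtain

\[ \begin{aligned} \frac{1}{n} I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n | Z^n \right) &\ge H(Y|Z) + \frac{1}{n} H\left( \mathbf{U}^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} H\left( \mathbf{U}^n, Y^n | X_{L+1}^n, Z^n \right) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \\ &= H(Y|Z) - \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n \right) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \\ &= H(Y|Z) - \frac{1}{n} H\left( Y^n, 1_V | \mathbf{U}^n, X_{L+1}^n, Z^n \right) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \\ &= H(Y|Z) - \frac{1}{n} H\left( 1_V | \mathbf{U}^n, X_{L+1}^n, Z^n \right) - \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V \right) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \\ &\ge H(Y|Z) - \frac{1}{n} - \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 1 \right) P(V) \\ &\qquad - \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 0 \right) P(V^c) - \frac{1}{n} - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \qquad (44) \\ &\ge H(Y|Z) - \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 1 \right) - \frac{2}{n} - \delta \log |\mathcal{Y}| - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l|, \qquad (45) \end{aligned} \]

where

(44) follows from the fact that $H\left( 1_V | \mathbf{U}^n, X_{L+1}^n, Z^n \right) \le 1$, and

(45) follows from Lemma 6(a) and the facts that

\[ H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 0 \right) \le n \log |\mathcal{Y}| \quad \text{and} \quad P(V) \le 1. \]

We now proceed to upper bound the second term in (45). Let $T_\mu^n(\mathbf{U} X_{L+1} Y Z)$ be the set of all jointly $\mu$-typical $(\mathbf{u}^n, x_{L+1}^n, y^n, z^n)$ sequences. We need the following lemma.

Lemma 7. [Lemma 1(d)] If $n$ is sufficiently large, then for any $(\mathbf{u}^n, x_{L+1}^n, y^n, z^n)$ in $T_\mu^n(\mathbf{U} X_{L+1} Y Z)$, we have

\[ P_{Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n}\left( y^n | \mathbf{u}^n, x_{L+1}^n, z^n \right) \ge \exp\left[ -n \left( H(Y | \mathbf{U}, X_{L+1}, Z) + 2\mu \right) \right]. \]

Using this lemma, we obtain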

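\[ \begin{aligned} \frac{1}{n} H\left( Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 1 \right) &= -\frac{1}{n} \sum_{T_\mu^n(\mathbf{U} X_{L+1} Y Z)} P_{\mathbf{U}^n, X_{L+1}^n, Y^n, Z^n | 1_V = 1} \log P_{Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n, 1_V = 1} \\ &\le \frac{1}{n} \sum_{T_\mu^n(\mathbf{U} X_{L+1} Y Z)} P_{\mathbf{U}^n, X_{L+1}^n, Y^n, Z^n | 1_V = 1} \log \frac{1}{P_{Y^n | \mathbf{U}^n, X_{L+1}^n, Z^n}} \\ &\le \sum_{T_\mu^n(\mathbf{U} X_{L+1} Y Z)} P_{\mathbf{U}^n, X_{L+1}^n, Y^n, Z^n | 1_V = 1} \left( H(Y | \mathbf{U}, X_{L+1}, Z) + 2\mu \right) \\ &= H(Y | \mathbf{U}, X_{L+1}, Z) + 2\mu. \end{aligned} \tag{46} \]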



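Substituting (46) into (45) gives

\[ \begin{aligned} \frac{1}{n} I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n | Z^n \right) &\ge I(Y; \mathbf{U}, X_{L+1} | Z) - \frac{2}{n} - 2\mu - \delta \log |\mathcal{Y}| - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \\ &\ge E - 3\mu - \delta \log |\mathcal{Y}| - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l|, \end{aligned} \tag{47} \]

where the last inequality follows from (37) and the fact that $n$ can be made arbitrarily large. We conclude from (38) and (47) that

\[ \left( R_1 + \epsilon, \ldots, R_L + \epsilon, \; E - 3\mu - \delta \log |\mathcal{Y}| - \delta \sum_{l=1}^{L} \log |\mathcal{U}_l| \right) \]

is in $\mathcal{R}_*^{CI}$. Since this is true for any $\delta > 0$, $\epsilon > 0$, and $\mu > 0$, we have that $(\mathbf{R}, E)$ is in $\overline{\mathcal{R}_*^{CI}}$. This together with Corollary 1 implies that $(\mathbf{R}, E)$ is in $\mathcal{R}^{CI}$.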
Appendix C: Proof of Theorem 2 



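Suppose $(\mathbf{R}, E)$ is in $\mathcal{R}_*^{CI}$. Then there exists a block length $n$ and encoders $f_l^{(n)}$ such that

\[ R_l \ge \frac{1}{n} \log \left| f_l^{(n)}(X_l^n) \right| \text{ for all } l \text{ in } \mathcal{L}, \text{ and} \tag{48} \]

\[ E \le \frac{1}{n} I\left( Y^n; (f_l^{(n)}(X_l^n))_{l \in \mathcal{L}}, X_{L+1}^n | Z^n \right). \tag{49} \]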

Consider any $X$ in $\chi$. Let $T$ be a time sharing random variable uniformly distributed over $\{1, \ldots, n\}$ and independent of $(\mathbf{X}^n, X_{L+1}^n, X^n, Y^n, Z^n)$. Define

Xi = X;"(r) for each / in £ U {L + 1}, 
X = X"(T), 

Y = y"(T), 

Z = Z'^iT), 

Ui = (XD , : T - 1), X£+i(r=), Z"(T^)) for each Z in £, and 

= (X"(r'^),X£+i(r'=),Z"(T'=)) . 

It is easy to verify that Ao = (U, W, T) is in Aq and 
It suffices to show that $(\mathbf{R}, E)$ is in $\mathcal{R}_o^{CI}(X, \lambda_o)$. We obtain the following from (49)



E < 



^ n r / 

= - ^ E{Y-(i)\Z-(i)) - H (Y-{i)\ (XD)^^^ : i - 

i=l '- ^ 

1 " r / 

< - ^ H{Y-{i)\Z-{i)) - H (y-{z)\ (XD)^^^ : z - : * - 1),X£ 

i=l '- ^ 

1 " r / \ 

f V _ _ \ \ ' I^jL- 



= I (y " (T) ; U, X£+i (T) I (T) , T) 
= /(y;U,Xi+i|Z,T), 



where 



les 

E 



(50) follows from the fact that conditioning reduces entropy, and

(51) follows because of the Markov chain

y"(l - : i - 1)) O (^(//"^ (Xf))^^^ , X£+i, Z"( 

Now let S C£. Then (48) implies 

nx:ii/>Ei°g|//"^ 

>gi?(//")(xr)) 



les 



) 



where 



(52) follows from the fact that conditioning reduces entropy, and

(53) follows because $X$ is in $\chi$.

We next lower bound the second sum in (53).

71 

= ^ J (xr(i); //"^ {xn\x\ xr(i -.i-i), x2+^, z^) 

i=l 

71 

= E (xr{i}\x",Xp{l : i - l),X£+i,Z") - H (xr{i)\f^"^ {Xn,X\Xl^{l : i - 1),X2+„Z") 

i=l 

71 

i=l 

71 

= ^/(xr(z);//"^ {xn\x\X2+„Z^ 

where (54) again follows from conditioning reduces entropy. On applying (55) in (53), we obtain 
1 " r / 

If 5° is nonempty, then continuing from (56) gives 

Y.Ri>I (X" (T) ; Us I Use , (T) , (T) , T) 

les 

+ ^ / (Xf (T) ; I X" (T) , X£+i (T) , Z" (T) , X" (T^) , X£+i (T^) , (T^) , T; 

= / (X; Us I Use , Xi+i , Z, T) + ^ 7 {Xi ■,Ui\X, W, X^+i , Z, T) . 

les 

Finally if 5 = £, then 

/ (x^ii); {Xn)^J (//"^ W))^^^^ : i - 1),X2+„Z-^ 

= 7 (x-{i); (XD)^^^ : i - 1),X£^,(0, Z"(n |x£+i(i), Z"(i)) . 

Substituting (57) into (56) yields 

^ > 7 {X; \J\Xl+i,Z, T) + ^ 7 (X,; [/; |X, W, Xl+i,Z, T) . 

This completes the proof of Theorem 2. 



(54) 
(55) 



(56) 



(57) 



Appendix D: Proof of Lemma 2 



It suffices to show that (C6) implies (C7). The other direction immediately follows by letting $\epsilon \to 0$. We can assume without loss of generality that $|\mathcal{X}| \ge 2$ because the lemma trivially holds otherwise. Let $\mathcal{X} = \{1, 2, \ldots, |\mathcal{X}|\}$ be the alphabet set of $X$. Let $P_i$ be the $i$th row of the stochastic matrix $P_{Y|X}$ corresponding to $X = i$. We need the following lemma.

Lemma 8. If (C6) holds, then rows $P_i$ corresponding to positive $P_X(i)$ are distinct.

Proof. The proof is by contradiction. Suppose that $P_X(1)$ and $P_X(2)$ are positive and $P_1 = P_2$. Let us define a random variable $U$ as

\[ U \triangleq \begin{cases} 2 & \text{if } X = 1, 2 \\ X & \text{otherwise}. \end{cases} \]
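The stochastic matrix $P_{X|U}$ has

\[ P_{X|U}(1|2) = \frac{P_X(1)}{P_X(1) + P_X(2)}, \qquad P_{X|U}(2|2) = \frac{P_X(2)}{P_X(1) + P_X(2)}, \]

\[ P_{X|U}(i|i) = 1 \text{ for all } i \text{ in } \{3, 4, \ldots, |\mathcal{X}|\}. \]

It is easy to see that $Y$, $X$, and $U$ form a Markov chain

\[ Y \leftrightarrow X \leftrightarrow U. \tag{58} \]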

We now have 

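\begin{align*}
H(Y|U) &= \sum_{i=2}^{|\mathcal{X}|} H(Y|U=i) P_U(i) \\
&= H(Y|U=2) P_U(2) + \sum_{i=3}^{|\mathcal{X}|} H(Y|U=i) P_U(i) \\
&= H\left( \sum_{j=1}^{|\mathcal{X}|} P_j P_{X|U}(j|2) \right) P_U(2) + \sum_{i=3}^{|\mathcal{X}|} H\left( \sum_{j=1}^{|\mathcal{X}|} P_j P_{X|U}(j|i) \right) P_U(i) \\
&= H(P_2) P_U(2) + \sum_{i=3}^{|\mathcal{X}|} H(P_i) P_U(i) \\
&= \sum_{i=2}^{|\mathcal{X}|} H(P_i) P_U(i), \tag{59}
\end{align*}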

and 

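\begin{align*}
H(Y|X) &= \sum_{j=1}^{|\mathcal{X}|} H(P_j) P_X(j) \\
&= \sum_{j=1}^{|\mathcal{X}|} H(P_j) \left( \sum_{i=2}^{|\mathcal{X}|} P_{X|U}(j|i) P_U(i) \right) \\
&= \sum_{i=2}^{|\mathcal{X}|} P_U(i) \sum_{j=1}^{|\mathcal{X}|} P_{X|U}(j|i) H(P_j) \\
&= P_U(2) \sum_{j=1}^{|\mathcal{X}|} P_{X|U}(j|2) H(P_j) + \sum_{i=3}^{|\mathcal{X}|} P_U(i) \sum_{j=1}^{|\mathcal{X}|} P_{X|U}(j|i) H(P_j) \\
&= P_U(2) H(P_2) + \sum_{i=3}^{|\mathcal{X}|} P_U(i) H(P_i) \\
&= \sum_{i=2}^{|\mathcal{X}|} P_U(i) H(P_i). \tag{60}
\end{align*}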

Now (58) through (60) together imply that $I(X; Y|U) = 0$, and hence $Y \leftrightarrow U \leftrightarrow X$. However,
\begin{align*}
H(X|U) &= \sum_{i=2}^{|\mathcal{X}|} H(X|U=i) P_U(i) \\
&= H(X|U=2) P_U(2) \\
&> 0,
\end{align*}
which contradicts our assumption that (C6) holds. □
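The merging construction in the proof of Lemma 8 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `merge_check` and the example distributions are hypothetical, and entropies are computed in nats. It merges symbols 1 and 2 of $X$ into a single value of $U$ and returns $I(X;Y|U) = H(Y|U) - H(Y|X)$ together with $H(X|U)$.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def merge_check(p_x, rows):
    """Merge symbols 1 and 2 of X into U (U = 2 if X in {1, 2}, else U = X).

    p_x  : marginal of X; index 0 corresponds to symbol 1 (hypothetical layout)
    rows : rows[i] is the row of P_{Y|X} for symbol i + 1
    Returns (I(X;Y|U), H(X|U)), using I(X;Y|U) = H(Y|U) - H(Y|X),
    which is valid because Y <-> X <-> U holds for this U.
    """
    k = len(p_x)
    p_u2 = p_x[0] + p_x[1]
    # P_{X|U}(.|2) puts mass only on symbols 1 and 2.
    p_x_given_u2 = [p_x[0] / p_u2, p_x[1] / p_u2] + [0.0] * (k - 2)
    # Given U = 2, Y is distributed as the mixture of rows P_1 and P_2.
    p_y_given_u2 = [sum(p_x_given_u2[j] * rows[j][y] for j in range(k))
                    for y in range(len(rows[0]))]
    h_y_given_u = p_u2 * entropy(p_y_given_u2) + sum(
        p_x[i] * entropy(rows[i]) for i in range(2, k))
    h_y_given_x = sum(p_x[i] * entropy(rows[i]) for i in range(k))
    h_x_given_u = p_u2 * entropy(p_x_given_u2)
    return h_y_given_u - h_y_given_x, h_x_given_u

# Identical rows for symbols 1 and 2: I(X;Y|U) vanishes while H(X|U) > 0.
i_same, h_same = merge_check([0.2, 0.3, 0.5],
                             [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6]])
# Distinct rows: the same merge leaves I(X;Y|U) strictly positive.
i_diff, h_diff = merge_check([0.2, 0.3, 0.5],
                             [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])
```

With identical rows the first returned value is zero up to floating-point error, which is the situation that violates (C6); with distinct rows it is strictly positive, so this particular $U$ no longer serves as a counterexample.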
Consider any $U$ that satisfies the Markov chain
\[
Y \leftrightarrow X \leftrightarrow U.
\]
We can assume without loss of generality that $P_U(u)$ is positive for all $u$ in $\mathcal{U}$ because only those $u$ with positive $P_U(u)$ contribute to $H(X|U)$ and $I(X; Y|U)$ in conditions (C6) and (C7). Then

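\begin{align*}
I(X; Y|U) &= H(Y|U) - H(Y|X) \\
&= \sum_{u \in \mathcal{U}} H(Y|U=u) P_U(u) - \sum_{i=1}^{|\mathcal{X}|} P_X(i) H(P_i) \\
&= \sum_{u \in \mathcal{U}} P_U(u) H\left( \sum_{i=1}^{|\mathcal{X}|} P_i P_{X|U}(i|u) \right) - \sum_{i=1}^{|\mathcal{X}|} \left( \sum_{u \in \mathcal{U}} P_{X|U}(i|u) P_U(u) \right) H(P_i) \\
&= \sum_{u \in \mathcal{U}} P_U(u) \left[ H\left( \sum_{i=1}^{|\mathcal{X}|} P_i P_{X|U}(i|u) \right) - \sum_{i=1}^{|\mathcal{X}|} P_{X|U}(i|u) H(P_i) \right] \\
&= \sum_{u \in \mathcal{U}} P_U(u) T\left( P_{X|U}(\cdot|u) \right), \tag{61}
\end{align*}
where (61) follows by setting
\[
T\left( P_{X|U}(\cdot|u) \right) \triangleq H\left( \sum_{i=1}^{|\mathcal{X}|} P_i P_{X|U}(i|u) \right) - \sum_{i=1}^{|\mathcal{X}|} P_{X|U}(i|u) H(P_i).
\]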
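The decomposition (61) is easy to sanity-check numerically. The sketch below is illustrative only: the names `jensen_gap` (standing in for $T$) and `cond_mutual_info`, and the example distributions, are assumptions introduced here rather than anything from the original text. It computes $I(X;Y|U)$ directly as $H(Y|U) - H(Y|X)$ and compares it with $\sum_u P_U(u) T(P_{X|U}(\cdot|u))$.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

def mix(weights, rows):
    """Mixture sum_i weights[i] * rows[i], a distribution on the alphabet of Y."""
    return [sum(w * r[y] for w, r in zip(weights, rows))
            for y in range(len(rows[0]))]

def jensen_gap(p, rows):
    """T(P) = H(sum_i P_i P(i)) - sum_i P(i) H(P_i), nonnegative by concavity of H."""
    return entropy(mix(p, rows)) - sum(w * entropy(r) for w, r in zip(p, rows))

def cond_mutual_info(p_u, p_x_given_u, rows):
    """I(X;Y|U) = H(Y|U) - H(Y|X), assuming the Markov chain Y <-> X <-> U."""
    k = len(rows)
    p_x = [sum(p_u[u] * p_x_given_u[u][i] for u in range(len(p_u)))
           for i in range(k)]
    h_y_given_u = sum(p_u[u] * entropy(mix(p_x_given_u[u], rows))
                      for u in range(len(p_u)))
    h_y_given_x = sum(p_x[i] * entropy(rows[i]) for i in range(k))
    return h_y_given_u - h_y_given_x

# Hypothetical example: three input symbols with distinct rows, binary U.
rows = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
p_u = [0.3, 0.7]
p_x_given_u = [[0.5, 0.5, 0.0], [0.1, 0.2, 0.7]]

lhs = cond_mutual_info(p_u, p_x_given_u, rows)
rhs = sum(p_u[u] * jensen_gap(p_x_given_u[u], rows) for u in range(len(p_u)))
```

The two quantities agree up to floating-point error, and `jensen_gap` evaluates to zero on a point mass such as `[0.0, 1.0, 0.0]`, consistent with the role $T$ plays in the rest of the proof.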

Since entropy is a strictly concave and continuous function, $T$ is a nonnegative continuous function of $P_{X|U}(\cdot|u)$. Moreover, for any $u$ in $\mathcal{U}$, $P_{X|U}(i|u) = 0$ for all $i$ in $\mathcal{X}$ such that $P_X(i) = 0$. Let $\mathcal{P}$ denote the set of all such $P_{X|U}(\cdot|u)$. Define
\[
\gamma(\delta) \triangleq \sup_{P \in \mathcal{P}} \{ H(P) : T(P) \le \delta \}.
\]
It now follows from Lemma 8 that if $T(P) = 0$ for some $P$ in $\mathcal{P}$, then $P$ must be a point mass and hence $H(P) = 0$. Therefore, $\gamma(0) = 0$. We next show that $\gamma$ is continuous at 0. Consider a nonnegative sequence $\delta_n \to 0$. Then there exists a sequence of distributions $P_n$ in $\mathcal{P}$ such that

\[
T(P_n) \le \delta_n \tag{62}
\]

H{Pn) > (63) 

Now, since the set of all distributions on $\mathcal{X}$ is a compact set, by considering a subsequence, we can assume without loss of generality that $P_n$ converges to $P$ in $\mathcal{P}$. By letting $n \to \infty$ in (62), we obtain that $T(P) = 0$, i.e., $P$ is a point mass. Therefore, $H(P) = 0$. It now follows from (63) that $\gamma(\delta_n) \to 0 = \gamma(0)$ as $n \to \infty$. Hence, $\gamma$ is continuous at 0.

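Fix $0 < \epsilon < \log |\mathcal{X}|$ (condition (C7) is always true for $\epsilon \ge \log |\mathcal{X}|$). Choose $\epsilon_1 > 0$ such that $\gamma(\epsilon_1 / \log |\mathcal{X}|) + \epsilon_1 = \epsilon$. Set $\delta = (\epsilon_1 / \log |\mathcal{X}|)^2$. Let $I(X; Y|U) \le \delta$. Define the sets
\[
\mathcal{U}_1 = \{ u \in \mathcal{U} : T(u) \le \sqrt{\delta} \} \quad \text{and} \quad \mathcal{U}_2 = \mathcal{U} \setminus \mathcal{U}_1.
\]
Note that $\mathcal{U}_1$ is nonempty because $\delta < 1$. We now have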

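\begin{align*}
\delta &\ge I(X; Y|U) \\
&= \sum_{u \in \mathcal{U}} P_U(u) T(u) \\
&\ge \sum_{u \in \mathcal{U}_2} P_U(u) T(u) \\
&\ge \sqrt{\delta} \sum_{u \in \mathcal{U}_2} P_U(u),
\end{align*}
which implies
\[
\sum_{u \in \mathcal{U}_2} P_U(u) \le \sqrt{\delta}.
\]
Hence,
\begin{align*}
H(X|U) &= \sum_{u \in \mathcal{U}_1} H(X|U=u) P_U(u) + \sum_{u \in \mathcal{U}_2} H(X|U=u) P_U(u) \\
&\le \gamma(\sqrt{\delta}) + \sqrt{\delta} \log |\mathcal{X}| \\
&= \gamma(\epsilon_1 / \log |\mathcal{X}|) + \epsilon_1 \\
&= \epsilon.
\end{align*}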
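The splitting step above is a Markov-inequality argument: if the $P_U$-average of $T(u)$ is at most $\delta$, then the values $u$ with $T(u) > \sqrt{\delta}$ carry total probability at most $\sqrt{\delta}$. A small sketch of that step follows; the function name `heavy_mass` and the example numbers are hypothetical and chosen only for illustration.

```python
import math

def heavy_mass(p_u, t_values, delta):
    """Total P_U-mass of U_2 = {u : T(u) > sqrt(delta)}."""
    threshold = math.sqrt(delta)
    return sum(p for p, t in zip(p_u, t_values) if t > threshold)

# Hypothetical example: the average of T under P_U plays the role of I(X;Y|U).
p_u = [0.6, 0.3, 0.1]
t_values = [0.0, 0.0, 0.5]
delta = sum(p * t for p, t in zip(p_u, t_values))
mass = heavy_mass(p_u, t_values, delta)
```

Here `mass` does not exceed `math.sqrt(delta)`, matching the bound on $\sum_{u \in \mathcal{U}_2} P_U(u)$ used to control the second sum in $H(X|U)$.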